Probing dark energy with future redshift surveys: A comparison of
emission line and broad band selection in the near infrared
ABSTRACT

Future galaxy surveys will map the galaxy distribution in the redshift interval 0.5 < z < 2 
using near-infrared cameras and spectrographs. The primary science goal of such surveys 
is to constrain the nature of the dark energy by measuring the large-scale structure of the 
Universe. This requires a tracer of the underlying dark matter which maximizes the useful 
volume of the survey. We investigate two potential survey selection methods: an emission
line sample based on the Hα line and a sample selected in the H-band. We present predictions
for the abundance and clustering of such galaxies, using two published versions of the
GALFORM galaxy formation model. Our models predict that Hα selected galaxies tend to
avoid massive dark matter haloes and instead trace the surrounding filamentary structure;
H-band selected galaxies, on the other hand, are found in the highest mass haloes. This has
implications for the measurement of the rate at which fluctuations grow due to gravitational
instability. We use mock catalogues to compare the effective volumes sampled by a range of
survey configurations. To give just two examples: a redshift survey down to $H_{\rm AB} = 22$
samples an effective volume that is ~5-10 times larger than that probed by an Hα survey with
$\log(F_{\rm H\alpha}\,[{\rm erg\,s^{-1}\,cm^{-2}}]) > -15.4$; a flux limit of at least
$\log(F_{\rm H\alpha}\,[{\rm erg\,s^{-1}\,cm^{-2}}]) = -16$
is required for an Hα sample to become competitive in effective volume.
1 INTRODUCTION

A number of approaches have been proposed to uncover the nature
of the accelerating expansion of the Universe which involve measuring
the large scale distribution of galaxies (e.g. Albrecht et al. 2006;
Peacock et al. 2006). The ability of galaxy surveys to discriminate
between competing models depends on their volume.
Once the solid angle of a survey has been set, the useful volume 
can be maximised by choosing a tracer of the large-scale structure 
of the Universe which can effectively probe the geometrical vol- 
ume. This depends on how the abundance of tracers drops with 
increasing redshift, and how much of this decline is offset by an 
increase in the clustering amplitude of the objects. 

Several wide-angle surveys have probed the redshift interval
0 < z < 1 (Colless et al. 2003; York et al. 2000; Cannon et al. 2006;
Blake et al. 2009). The next major step up in
volume will be made when the range 0.5 < z < 2 is opened
up with large near-infrared cameras and spectrographs which are
mounted on telescopes able to map solid angles running into thousands
of square degrees. From the ground, this part of the electromagnetic
spectrum is heavily absorbed by water vapour in the
Earth's atmosphere and affected by the strong atmospheric OH
emission lines. A space mission to construct an all-sky map of
galaxies in the redshift range 0.5 < z < 2 would have a significant
advantage over a ground based survey in that the sky background
in the near-infrared (NIR) is around 500 times weaker in space than
it is on the ground.

An important issue yet to be resolved for a galaxy survey
extending to z ~ 2 is the construction of the sample and the
method by which the redshifts will be measured. One option is
to use slitless spectroscopy and target the Hα emission line. Hα
is located at a restframe wavelength of λ = 6563 Å, which, for
galaxies at z > 0.5, falls into the near-infrared part of the electromagnetic
spectrum (Thompson et al. 1996; McCarthy et al. 1999;
Hopkins et al. 2000; Shim et al. 2009). Hα emission is powered by
UV ionizing photons from massive young stars. The only source
of attenuation is dust, which is less important at the wavelength of
Hα than it is for shorter wavelength lines. This makes Hα a more
direct tracer of galaxies which are actively forming stars than other
lines such as Lyα, OII, OIII, Hβ or Hγ, which suffer from one or
more sources of attenuation (i.e. dust, stellar absorption, resonant
scattering) and which are more sensitive to the metallicity and ionisation
state of the gas. The second option is to use some form of
multi-slit spectrograph to carry out a redshift survey of a magnitude
limited sample. The use of a slit means that unwanted background
is reduced, allowing fainter galaxies to be targeted. Also, it is easier
to identify which spectrum belongs to which galaxy with a slit
than it is with slitless spectroscopy. Targets could be selected in the
H-band at an effective wavelength of just over 1 micron, which is
around the centre of the near infrared wavelength part of the spectrum.
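
The wavelength argument above can be checked with a few lines of arithmetic. The sketch below redshifts the Hα rest wavelength quoted in the text across the survey range; the function name and the sampled redshifts are illustrative choices rather than anything specified in the paper.

```python
# Rest-frame wavelength of H-alpha quoted in the text, in Angstroms.
HALPHA_REST_ANGSTROM = 6563.0


def observed_wavelength_micron(z, rest_angstrom=HALPHA_REST_ANGSTROM):
    """Observed wavelength in microns: lambda_obs = lambda_rest * (1 + z)."""
    return rest_angstrom * (1.0 + z) * 1.0e-4  # 1 Angstrom = 1e-4 micron


if __name__ == "__main__":
    # Illustrative redshifts spanning the 0.5 < z < 2 range discussed in the text.
    for z in (0.5, 1.0, 1.5, 2.0):
        wavelength = observed_wavelength_micron(z)
        print(f"z = {z:.1f}: H-alpha observed at {wavelength:.3f} micron")
```

Over 0.5 < z < 2 the line moves from roughly 0.98 to 1.97 micron, which is why a near-infrared spectrograph is needed to follow it.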

Space missions designed to carry out redshift surveys like the
ones outlined above are currently being planned and assessed on
both sides of the Atlantic. At the time of writing, the European
Space Agency is conducting a Phase A study of a mission proposal
called Euclid, one component of which is a galaxy redshift survey.
Both of the selection techniques mentioned above are being evaluated
as possible spectroscopic solutions. The slit solution for Euclid
is based on a novel application of digital micromirror devices
(DMDs) to both image the galaxies to build a parent catalogue in
the H-band and to measure their redshifts (see Cimatti et al. 2009
for further details about the Euclid redshift survey). An Hα mission
is also being discussed in the USA. At this stage, the sensitivity
of these missions is uncertain and subject to change. For this reason
we consider a range of Hα flux limits and H-band magnitudes
when assessing the performance of the surveys. The specifications
and performance currently being discussed for these missions have
motivated the range of fluxes that we consider.

A simple first impression of the relative merits of different
selection methods can be gained by calculating the effective volume
of the resulting survey. This requires knowledge of the survey
geometry and redshift coverage, along with the redshift evolution
of the number density of sources and their clustering strength.
In this paper we use published galaxy formation models to predict
the abundance and clustering of different samples of galaxies
in order to compute the effective volumes of a range of Hα and
H-band surveys. Observationally, relatively little is known about
the galaxy population selected by Hα emission or H-band magnitude
at 0.5 < z < 2. Empirically it is possible to estimate the
number density of sources from the available luminosity function
data and, on adopting a suitable model, to use the limited clustering
measurements currently available to infer the evolution of the
number density and bias (Shioya et al. 2008; Morioka et al. 2008;
Geach et al. 2008). Geach et al. (2009), in a complementary study
to this one, make an empirical estimate of the number density of
Hα emitters, and combine this with the predictions of the clustering
of these galaxies presented in this paper to estimate the efficiency
with which Hα emitters can measure the large scale structure of the
Universe.
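
To make the ingredients of an effective volume estimate concrete, the following sketch assumes the widely used form V_eff = ∫ [nP/(1 + nP)]² dV, where n is the galaxy number density and P the galaxy power spectrum amplitude at the scale of interest. The exact definition adopted later in the paper is not reproduced here, so the formula, the flat ΛCDM cosmology, and all function and parameter names are assumptions made for illustration only.

```python
import math

C_KM_S = 299792.458          # speed of light [km/s]
FULL_SKY_DEG2 = 41252.96     # area of the whole sky [square degrees]


def comoving_distance(z, omega_m=0.25, n_steps=2000):
    """Comoving distance in h^-1 Mpc for flat LCDM (H0 = 100 h km/s/Mpc)."""
    # Midpoint-rule integral of dz / E(z), E(z) = sqrt(Om (1+z)^3 + 1 - Om).
    dz = z / n_steps
    total = 0.0
    for i in range(n_steps):
        zi = (i + 0.5) * dz
        total += dz / math.sqrt(omega_m * (1.0 + zi) ** 3 + 1.0 - omega_m)
    return (C_KM_S / 100.0) * total


def effective_volume(z_edges, nbar, power, area_deg2, omega_m=0.25):
    """Return (V_eff, V_geometric) in h^-3 Mpc^3, summed over redshift shells.

    z_edges : shell boundaries (length N + 1)
    nbar    : galaxy number density per shell [h^3 Mpc^-3] (length N)
    power   : galaxy power spectrum amplitude per shell [h^-3 Mpc^3] (length N)
    """
    sky_fraction = area_deg2 / FULL_SKY_DEG2
    v_eff = 0.0
    v_geom = 0.0
    for z_lo, z_hi, n, p in zip(z_edges[:-1], z_edges[1:], nbar, power):
        r_lo = comoving_distance(z_lo, omega_m)
        r_hi = comoving_distance(z_hi, omega_m)
        shell = sky_fraction * 4.0 / 3.0 * math.pi * (r_hi ** 3 - r_lo ** 3)
        weight = (n * p / (1.0 + n * p)) ** 2  # tends to 1 when nP >> 1
        v_geom += shell
        v_eff += weight * shell
    return v_eff, v_geom


if __name__ == "__main__":
    # Placeholder densities and power, NOT predictions from the models.
    edges = [0.5, 1.0, 1.5, 2.0]
    v_eff, v_geom = effective_volume(edges, [1e-3, 5e-4, 1e-4], [1e4] * 3, 20000.0)
    print(f"V_eff / V_geom = {v_eff / v_geom:.2f}")
```

The weight makes the trade-off in the text explicit: a falling number density reduces nP, while a rising clustering amplitude (larger P for a more biased tracer) partly compensates.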

The outline of the paper is as follows: in Section 2 we give a
brief overview of the models. Some general properties of Hα emitters
in the models, such as luminosity functions (LF), equivalent
widths (EW) and clustering bias are presented in Section 3 as these
have not been published elsewhere. In Section 4 we show how our
models can be used to build mock survey catalogues. We analyse
the differences in the clustering of Hα emitters and H-band selected
galaxies and present an indication of the efficiency with which different
surveys trace large-scale structure (LSS). Finally, we give
our conclusions in Section 5.



2 THE MODELS 

In this paper we present predictions for the clustering of galaxy
samples selected in the near-infrared using two published versions
of the semi-analytic model GALFORM. An overview of the semi-analytical
approach to modelling galaxy formation can be found
in Baugh (2006). The GALFORM code is described in Cole et al.
(2000) and Benson et al. (2003). The two models considered in this
paper are explained fully in the original papers, Baugh et al. (2005)
(hereafter the Bau05 model) and Bower et al. (2006) (hereafter
the Bow06 model). A thorough description of the ingredients of
the Bau05 model can also be found in Lacey et al. (2008); detailed
comparisons of the physical ingredients of the two models
are given in Almeida et al. (2007, 2008), Gonzalez et al. (2009)
and Gonzalez-Perez et al. (2009). Here we give an overview of the
main features of each model and refer the reader to the above references
for further details.

The models are used to calculate the properties of the galaxy
population as a function of time, starting from the merger histories
of dark matter haloes and invoking a set of rules and recipes to describe
the baryonic physics. These prescriptions require parameter
values to be set to define the model. These values are set by comparing
the model predictions against observations of local galaxies.
The Bau05 and Bow06 models have many ingredients in common
but differ in the way in which they suppress the formation of bright
galaxies. Also, different emphasis was placed on reproducing various
local datasets when setting the parameters of the two models.
It is important to remember that our starting point here is the two
"off the shelf" galaxy formation models, which were set up without
reference to Hα or H-band observations. In view of this it is
remarkable how close these models come to matching the observed
Hα LFs and H-band counts and redshift distributions, as presented
in the next sections.

The Bau05 model uses a superwind to stifle the formation of
bright galaxies. The rate of mass ejection is assumed to be proportional
to the star formation rate. The superwind ejects baryons
from small and intermediate mass haloes. The cooling rate in massive
haloes is reduced because these haloes have a reduced baryon
fraction, due to the operation of the superwind in their progenitors.
The model assumes that star formation which takes place in bursts
occurs with a top-heavy initial mass function (IMF). For each solar
mass of stars formed, four times the number of Lyman continuum
photons are produced in a starburst as would be made in a quiescent
episode of star formation, in which stars are produced with a standard
solar neighbourhood IMF (Kennicutt 1983). Highlights of the
Bau05 model include matching the observed number counts and
redshift distribution of sub-millimetre sources and the luminosity
function of Lyman-break galaxies. The Bau05 model also successfully
reproduces the abundance and properties (including clustering)
of Lyα emitters (Le Delliou et al. 2005, 2006; Orsi et al.
2008).

The Bow06 model, on the other hand, uses feedback from active
galactic nuclei (AGN) to stop the formation of bright galaxies.
The accretion of "cooling flow" gas directly onto a central supermassive
black hole releases jets of energy which heat the hot gas
and greatly reduce the cooling flow (see Croton et al. 2005). Hence
the supply of cooling gas for star formation is switched off. The
Bow06 model gives a good match to the bimodal nature of the
colour distribution of local galaxies (Gonzalez et al. 2009), to the
abundance of red galaxies (Almeida et al. 2007; Gonzalez-Perez
et al. 2009) and to the evolution of the stellar mass function (Bower
et al. 2006).

Other differences between the two models include: i) starbursts
triggered by dynamically unstable disks in the Bow06
model; ii) a universal solar neighbourhood IMF in the Bow06
model; iii) the use of dark matter halo merger histories extracted
from an N-body simulation in the Bow06 model, whereas the
Bau05 model uses Monte-Carlo generated trees; iv) a slightly different
set of cosmological parameters ($\Omega_{\rm m} = 0.3$, $\Omega_{\Lambda} = 0.7$,
$\Omega_{\rm b} = 0.04$, $h = 0.7$, $\sigma_8 = 0.9$ for the Bau05 model, and
$\Omega_{\rm m} = 0.25$, $\Omega_{\Lambda} = 0.75$, $\Omega_{\rm b} = 0.045$, $h = 0.73$, $\sigma_8 = 0.93$
for the Bow06 model).

The calculation of H-band flux and Hα line emission is the
same in both models. The model predicts the star formation history 
of each galaxy, recording the star formation rate and the metallic- 
ity with which stars are made in each of the galaxy's progenitors. 
This allows a composite stellar population and spectral energy dis- 
tribution to be built up. The model predicts the scale size of the 
galaxy and, through a chemical evolution model, the metal con- 
tent of the disk and bulge. The H-band magnitude is computed by 
convolving the model galaxy spectral energy distribution with an 
H-band filter, appropriately shifted in wavelength if the galaxy is 
observed at z > 0. The effect of dust extinction is taken into ac- 
count by assuming that the dust and disk stars are mixed together 
(Cole et al. 2000). The spectral energy distribution also gives the 
rate of production of Lyman continuum photons. Then, all of the 
ionizing photons are assumed to be absorbed by the neutral gas 
in the galaxy, and, by adopting case B recombination (Osterbrock
1989), the emissivity of the Hα line (and other emission lines) is
computed. Here we assume that the attenuation of the Hα emission
is the same as that experienced by the continuum at the wavelength
of Hα. To predict the equivalent width (EW) of the Hα emission,
we simply divide the luminosity of the line by the luminosity of the
continuum around the Hα line.
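
The chain from ionizing photons to equivalent width described above can be sketched as follows. The conversion coefficient is a commonly quoted Case B value (about 1.37e-12 erg of Hα emission per Lyman continuum photon at a temperature near 10^4 K) and is used here as an assumption; the text does not give the coefficients used inside GALFORM, and the function names and input numbers are purely illustrative.

```python
# Assumed Case B conversion: L_Halpha [erg/s] ~ 1.37e-12 * Q_LyC [photons/s].
ERG_PER_IONIZING_PHOTON = 1.37e-12


def halpha_luminosity(q_lyc, attenuation=1.0):
    """H-alpha luminosity [erg/s] from the Lyman continuum photon rate [1/s].

    `attenuation` is the fraction of line flux that survives dust (1 = no dust).
    All ionizing photons are assumed to be absorbed by the gas, as in the text.
    """
    return ERG_PER_IONIZING_PHOTON * q_lyc * attenuation


def equivalent_width(l_line, l_cont_per_angstrom, z=0.0):
    """EW in Angstroms: line luminosity divided by continuum luminosity density.

    The observer-frame EW is the rest-frame value stretched by (1 + z).
    """
    return (l_line / l_cont_per_angstrom) * (1.0 + z)


if __name__ == "__main__":
    q = 1.0e53            # illustrative ionizing photon rate [1/s]
    cont = 1.0e39         # illustrative continuum near H-alpha [erg/s/Angstrom]
    dust = 0.5            # same attenuation applied to line and continuum
    ew_rest = equivalent_width(halpha_luminosity(q, dust), cont * dust)
    print(f"rest-frame EW = {ew_rest:.0f} Angstrom")
```

Because the line and the adjacent continuum are attenuated by the same factor in this scheme, the dust term cancels in the ratio, so the equivalent width is unchanged by dust even though the line flux is reduced.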



3 PROPERTIES OF Hα EMITTERS

We first concentrate on the nature of Hα emitters in the models,
which have not been discussed elsewhere for GALFORM, before
examining the clustering of Hα and H-band selected samples in
more detail in the next section. In this section we present the basic
predictions for the abundance, equivalent width distributions and
clustering of Hα emitters. Note that all the results presented here
include the attenuation of the Hα emission by dust in the ISM at
the same level experienced by the continuum at the wavelength of
Hα.



3.1 The Hα luminosity function

A basic prediction of the models is the evolution of the Hα luminosity
function (LF). Fig. 1 shows the Hα LFs predicted by
the two versions of GALFORM compared with observational data,
over the redshift interval 0 < z < 2. At each redshift plotted,
the Bau05 model predicts a higher number density of Hα
emitters than the Bow06 model for luminosities brighter than
$\log(L_{\rm H\alpha}\,[{\rm erg\,s^{-1}}\,h^{-2}]) \sim 42$. This reflects two processes: the
relative efficiency of the feedback mechanisms used in the two
models to suppress the formation of bright galaxies, and the top-heavy
IMF adopted in starbursts in the Bau05 model, which, for a
galaxy with a given star formation rate, boosts the Hα flux emitted.
The bright end of the Hα LF is dominated by bursting galaxies.
At faint luminosities, Fig. 1 shows that the predicted model
LFs are more similar. At these luminosities, the star formation
in both models predominantly takes place in galactic disks and
produces stars with a standard IMF. For luminosities fainter than
$\log(L_{\rm H\alpha}\,[{\rm erg\,s^{-1}}\,h^{-2}]) \sim 40$, the Bow06 model suffers from
the limited mass resolution of the Millennium Simulation halo merger
trees (Springel et al. 2005) compared with that of the Monte Carlo
trees used in the Bau05 model (Helly et al. 2003).

The observational data shown in Fig. 1 come from
Jones & Bland-Hawthorn (2001) for z ~ 0; Fujita et al. (2003),
Hippelein et al. (2003), Jones & Bland-Hawthorn (2001),
Morioka et al. (2008), Pascual et al. (2001) and Shioya et al. (2008) for
z ~ 0.2; Tresse et al. (2002), Villar et al. (2008), Sobral et al.
(2009) and Shim et al. (2009) for z ~ 0.9; and Geach et al.
(2008) and Shim et al. (2009) for z = 2.2. Most of these observational
data have not been corrected by the authors for dust extinction, and
hence can be directly compared to the GALFORM predictions,
which include dust attenuation. However, in some cases the data
were originally presented after correction for an assumed constant
attenuation. In such cases we have undone this "correction". Hence,
our comparison concerns the actual observed number of Hα emitters,
which is the relevant quantity for assessing the performance of
a redshift survey.

In general both models overpredict the number of low luminosity
Hα emitters at z ≲ 0.3, as shown by Fig. 1. At
z = 0 (upper-left panel in Fig. 1), the amplitude of the LF
in both models is larger, by almost an order of magnitude, than
the Jones & Bland-Hawthorn (2001) data. A similar conclusion is
reached at z = 0.2 (upper-right panel in Fig. 1), on comparing the
models to most of the observational data. However, there is significant
scatter in observations of the faint end of the LF. At redshifts
z > 1 (bottom panels in Fig. 1), the Bow06 model tends to underpredict
the observational LF, whereas the Bau05 model overpredicts it.
Despite the imperfect agreement, the model LFs therefore bracket the
observed LFs at the redshifts relevant to the proposed space missions,
so we proceed to use them for the purposes of this paper.



3.2 Hα equivalent width (EW) distribution

Broadly speaking the EW of the Hα line depends on the current
star formation rate (SFR) in a galaxy (which determines the Hα emission), and its stellar
mass (to which the continuum luminosity is more closely related).
We compare the model predictions for the EW of Hα versus
Hα flux with observational results in Fig. 2. The observational data
cover a wide redshift interval, 0.7 < z < 1.9 (McCarthy et al.
1999; Hopkins et al. 2000; Shim et al. 2009). In order to mimic the
observational selection when generating model predictions, we go
through the following two steps. First, we run the models for a set
of redshifts covering the above redshift range. Second, we weight
the $EW_{\rm obs}$ distribution at a given flux by dN/dz, the redshift distribution
of Hα emitters over the redshift range, to take into account
the change in the volume element between different redshifts (see
Section 4 for details of the calculation of dN/dz).
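
The two-step procedure above amounts to pooling model outputs from several redshifts and taking weighted statistics. A minimal sketch follows, assuming that every galaxy in a given output shares that output's dN/dz weight equally; this per-galaxy weighting, the function names, and the sample values are assumptions for illustration and are not taken from the paper.

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight fraction reaches q (0 < q <= 1)."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= q * total:
            return value
    return pairs[-1][0]


def pooled_ew_quantile(ew_by_z, dndz_by_z, q=0.5):
    """Quantile of EW_obs at fixed flux, pooling outputs weighted by dN/dz.

    ew_by_z   : {redshift: [EW_obs of galaxies in the flux bin at that output]}
    dndz_by_z : {redshift: dN/dz of H-alpha emitters at that redshift}
    """
    values, weights = [], []
    for z, ews in ew_by_z.items():
        if not ews:
            continue
        per_galaxy = dndz_by_z[z] / len(ews)  # split the output's weight evenly
        values.extend(ews)
        weights.extend([per_galaxy] * len(ews))
    return weighted_quantile(values, weights, q)


if __name__ == "__main__":
    # Made-up EW samples (Angstroms) at two model output redshifts.
    ews = {0.8: [100.0, 120.0], 1.5: [300.0, 320.0]}
    print(pooled_ew_quantile(ews, {0.8: 3.0, 1.5: 1.0}))  # low-z output dominates
    print(pooled_ew_quantile(ews, {0.8: 1.0, 1.5: 3.0}))  # high-z output dominates
```

The same helper with q = 0.16 and q = 0.84 (or 0.025 and 0.975) would give intervals of the kind shaded in the EW figure.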

Fig. 2 shows the $EW_{\rm obs}$ distribution predicted by the Bau05
model (top panel) and the Bow06 model (bottom panel). The models
predict different trends of $EW_{\rm obs}$ with Hα flux. In the Bau05
model, the typical EW increases with Hα flux, with a median
value close to $EW_{\rm obs} \sim 100$ Å at $\log(F_{\rm H\alpha}\,[{\rm erg\,s^{-1}\,cm^{-2}}]) = -18$,
reaching $EW_{\rm obs} \sim 2000$ Å at $\log(F_{\rm H\alpha}\,[{\rm erg\,s^{-1}\,cm^{-2}}]) = -14$.
In contrast, the Bow06 model predicts a slight decline of
$EW_{\rm obs}$ with Hα flux until very bright fluxes are reached, with median
$EW_{\rm obs} \sim 100$ Å in the range $\log(F_{\rm H\alpha}\,[{\rm erg\,s^{-1}\,cm^{-2}}]) = [-18, -15]$.
For $\log(F_{\rm H\alpha}\,[{\rm erg\,s^{-1}\,cm^{-2}}]) > -15$, the Bow06
model predicts a sharp increase of the median $EW_{\rm obs}$ to ~3000 Å.
The 95% interval of the $EW_{\rm obs}$ found in GALFORM galaxies (the
light grey region in Fig. 2) covers almost 2 orders of magnitude
in both models, except in the plateau found in the brightest bin of
the Bow06 model, where the distribution covers 3 orders of magnitude.
The Bau05 model matches the observed distribution of
equivalent widths the best, particularly after the rescaling of continuum
and line luminosities discussed in the next section (after
which the median EW versus Hα distribution shifts from the solid
black to the dashed magenta line). It is interesting to note that the
"shifted" relations (see §4) give a better match to the observations
for both models (although the Bau05 model remains a better fit),
particularly as the shift was derived with reference to the H-band
galaxy number counts (for the continuum) and to the z ~ 1 Hα LF,
rather than to the EW data.

Figure 1. The Hα luminosity function, including attenuation by dust, at different redshifts. The blue curves show the predictions of the Bau05 model,
whereas red curves show the Bow06 model. The observational estimates are represented by the symbols (see text for details). The redshift displayed in
the bottom-right corner of each panel gives the redshift at which the GALFORM models were run. The vertical black dashed line shows the Hα luminosity
corresponding to the flux $\log(F_{\rm H\alpha}\,[{\rm erg\,s^{-1}\,cm^{-2}}]) = -15.4$ for z > 0, displayed to show the expected luminosity limit of current planned space missions.
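
The conversion between a survey flux limit and the corresponding Hα luminosity, as marked by the dashed line in Figure 1, follows from L = 4π d_L² F. The sketch below assumes a flat ΛCDM cosmology and expresses the luminosity in erg s^-1 h^-2, which is the convention adopted here because distances are computed in h^-1 Mpc; the function names and the default Ω_m are illustrative assumptions.

```python
import math

C_KM_S = 299792.458          # speed of light [km/s]
CM_PER_MPC = 3.0856776e24    # centimetres per megaparsec


def luminosity_distance_hinv_mpc(z, omega_m, n_steps=2000):
    """Luminosity distance in h^-1 Mpc for a flat LCDM cosmology."""
    # Midpoint-rule integral of dz / E(z), E(z) = sqrt(Om (1+z)^3 + 1 - Om).
    dz = z / n_steps
    integral = sum(
        dz / math.sqrt(omega_m * (1.0 + (i + 0.5) * dz) ** 3 + 1.0 - omega_m)
        for i in range(n_steps)
    )
    return (1.0 + z) * (C_KM_S / 100.0) * integral


def log_luminosity_from_flux(log_flux_cgs, z, omega_m=0.3):
    """log10 L [erg/s h^-2] for a line flux given as log10 F [erg/s/cm^2]."""
    d_cm = luminosity_distance_hinv_mpc(z, omega_m) * CM_PER_MPC
    return log_flux_cgs + math.log10(4.0 * math.pi * d_cm ** 2)


if __name__ == "__main__":
    for z in (0.5, 0.9, 2.0):
        print(f"z = {z}: log L at log F = -15.4 is "
              f"{log_luminosity_from_flux(-15.4, z):.2f}")
```

With Ω_m = 0.3, a flux of log F = -15.3 at z = 0.9 maps to a luminosity of approximately log L = 42 in these units.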

3.3 Clustering of Hα emitters: effective bias

The clustering bias, b, is defined as the square root of the ratio of the
galaxy correlation function to the correlation function of the dark
matter (Kaiser 1984). As we shall see in Section 4.3, the clustering
bias is a direct input into the calculation of the effective volume of
a galaxy survey, which quantifies how well the survey can measure
the large scale structure of the Universe. Simulations show that the
correlation functions of galaxies and dark matter reach an approximately
constant ratio on large scales (see for example Angulo et al.
2008a; note, however, that small departures from a constant ratio
are apparent even on scales in excess of $100\,h^{-1}$ Mpc).

Figure 2. The distribution of Hα equivalent width in the observer frame
as a function of Hα flux, over the redshift interval 0.7 < z < 1.9. The
top panel shows the predictions of the Bau05 model and the bottom panel
shows the Bow06 model, calculated as described in the text. The black line
shows the median EW at each flux. The shaded regions enclose 68% (dark
grey) and 95% (light grey) respectively of the GALFORM predictions around
the median (black circles). The blue circles show observational data from
Hopkins et al. (2000), green asterisks show data from Shim et al. (2009)
and red diamonds show data from McCarthy et al. (1999), as indicated by
the key. The magenta dashed lines show the GALFORM predictions for the
median equivalent width after applying the empirically derived continuum
flux and line luminosity rescalings described in Section 4.

In this section we compute the effective bias of samples
of Hα emitting galaxies. There are theoretical prescriptions for
calculating the bias factor of dark matter haloes as a function
of mass and redshift (Cole & Kaiser 1989; Mo & White
1996; Sheth, Mo & Tormen 2001). These have been extensively
tested against the clustering of haloes measured in N-body
simulations and have been found to be reasonably accurate
(Gao, Springel & White 2005; Wechsler et al. 2006; Angulo et al.
2008b). Here we use Sheth, Mo & Tormen (2001). The effective
bias is computed by integrating over the halo mass the bias factor
corresponding to the dark matter halo which hosts a galaxy
multiplied by the abundance of the galaxies of the chosen luminosity
(see, for example, Baugh et al. 1999; Le Delliou et al. 2006;
Orsi et al. 2008).
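
The effective bias described above is an abundance-weighted average of the halo bias over the haloes hosting the selected galaxies. The sketch below implements that average on a discrete grid of halo mass bins; the bias values, halo abundances and occupation numbers in the example are toy inputs chosen for illustration (they are not outputs of the Sheth, Mo & Tormen formula or of GALFORM), and all names are hypothetical.

```python
def effective_bias(halo_bias, halo_abundance, galaxies_per_halo, mass_bin_width):
    """Abundance-weighted mean halo bias of a galaxy sample.

    All arguments are equal-length sequences over halo mass bins:
      halo_bias         b(M), e.g. from a fitting formula
      halo_abundance    n(M), haloes per unit volume per unit mass
      galaxies_per_halo mean number of selected galaxies in a halo of mass M
      mass_bin_width    width dM of each mass bin
    """
    numerator = 0.0
    denominator = 0.0
    for b, n, n_gal, dm in zip(halo_bias, halo_abundance,
                               galaxies_per_halo, mass_bin_width):
        weight = n * n_gal * dm  # number density of selected galaxies in the bin
        numerator += b * weight
        denominator += weight
    if denominator == 0.0:
        raise ValueError("the sample contains no galaxies")
    return numerator / denominator


if __name__ == "__main__":
    # Toy inputs: three mass bins from low to high halo mass.
    bias = [0.7, 1.0, 2.0]
    abundance = [1e-2, 1e-3, 1e-5]
    widths = [1.0, 1.0, 1.0]
    sample_a = [0.5, 1.0, 0.5]    # few galaxies in the most massive haloes
    sample_b = [0.1, 1.0, 20.0]   # many galaxies in the most massive haloes
    print(f"sample A: b_eff = {effective_bias(bias, abundance, sample_a, widths):.2f}")
    print(f"sample B: b_eff = {effective_bias(bias, abundance, sample_b, widths):.2f}")
```

A sample that preferentially populates massive haloes picks up more weight from the high-bias bins and therefore has the larger effective bias.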

Fig. 3 shows the predicted galaxy bias, $b_{\rm eff}$, as a function
of Hα luminosity over the redshift interval 0 < z < 2. There
is a clear increase in the value of the effective bias with redshift;
at $\log(L_{\rm H\alpha}\,[{\rm erg\,s^{-1}}\,h^{-2}]) = 40$, $b_{\rm eff} \approx 0.8$ at z = 0,
compared with $b_{\rm eff} \approx 1.5$ at z = 2. Both models show an upturn
in the effective bias with decreasing luminosity faintwards of
$\log(L_{\rm H\alpha}\,[{\rm erg\,s^{-1}}\,h^{-2}]) = 40$. There is little dependence of bias
on luminosity brightwards of $\log(L_{\rm H\alpha}\,[{\rm erg\,s^{-1}}\,h^{-2}]) = 40$, up
to z = 2. The predictions of the two models for the effective bias
are quite similar. There are currently few observational measurements
of the clustering of Hα emitters. Geach et al. (2008) inferred
a spatial correlation length of $r_0 = 4.2^{+0.4}_{-0.2}\,h^{-1}$ Mpc for their sample
of 55 Hα emitters at z = 2.23. This corresponds to a bias of
b ~ 1.7 in the Bau05 model cosmology, which is in very good
agreement with the predictions plotted in Fig. 3.

Figure 3. The effective bias parameter as a function of Hα luminosity for
redshifts spanning the range 0 < z < 2. The Bau05 model results are
shown using circles connected with solid lines and the Bow06 model results
are shown with asterisks connected by dashed lines. Each colour corresponds
to a different redshift, as indicated by the key.

Model    C_Hα    C_cont
Bau05    0.35    0.73
Bow06    1.73    0.42

Table 1. Luminosity rescaling factors for the Hα line and the stellar continuum.
Column 2 shows C_Hα, the factor used to adjust the predicted Hα
flux as described in the text. This factor is only applied to the Hα line.
Column 3 shows C_cont, the correction factor applied to the stellar continuum,
as derived by forcing the model to match the observed H-band counts at
$H_{\rm AB} = 22$. This factor is applied to the entire stellar continuum of the
model galaxies.



4 THE EFFECTIVENESS OF REDSHIFT SURVEYS FOR
MEASURING DARK ENERGY

In this section we assess the relative merits of using Hα or H-band
selection to construct future redshift surveys aimed at measuring
the dark energy equation of state. The first step is to produce
a mock catalogue that can reproduce currently available observations.
We discuss how we do this in Section 4.1. We then present
predictions for the clustering of Hα emitters and H-band selected
galaxies in Section 4.2. We quantify the performance of the two
selection methods in terms of how well the resulting surveys can
measure the large-scale structure of the Universe in Section 4.3.

Figure 4. Number counts in the H band. The upper panel shows the differential
counts on a log scale. The lower panel shows the counts after dividing
by a power law, $N_{\rm ref}$, to expand the dynamic range on the y-axis.
The symbols show the observational data, as shown by the key in the upper
panel. The lines show the model predictions. The dotted lines show the
original GALFORM predictions for the Bau05 model (blue) and the Bow06
model (red). The solid curves show the GALFORM predictions after
rescaling the model galaxy luminosities to match the observed number
counts at $H_{\rm AB} = 22$.

Figure 5. The redshift distribution of galaxies with $H_{\rm AB} < 22$ (left column)
and $H_{\rm AB} < 23$ (right column). The top panels show the predictions
after rescaling the model luminosities to better match the number counts
as explained in the text. Red and blue lines show the model predictions for
$H_{\rm AB} < 22$ and $H_{\rm AB} < 23$ respectively. Solid lines show the Bau05 (r)
model and the dashed lines show the Bow06 (r) model. The lower panel
shows the redshift distribution obtained from the Bow06 model by diluting
the galaxies, randomly selecting 0.63 of the sample, the Bow06 (d) model
(recall this is a purely illustrative case with no physical basis; see §4.1.1).
In both panels, the histogram shows an estimate of the redshift distribution
derived from spectroscopic observations in the COSMOS and UDF fields
(Cirasuolo et al. 2008; Euclid-NIS Science Team, private communication).

4.1 Building accurate mock catalogues

Our goal in this section is to build mock catalogues for future
redshift surveys which agree as closely as possible with currently
available observational data. We have already seen that the models
are in general agreement with observations of the Hα luminosity
function, and will see in the next subsection how well the
models match the H-band number counts. In our normal mode of
operation, we set the model parameters with reference to a subset
of local observations and see how well the model then agrees
with other observables. This allows us to test the physics of the
model; if the model cannot reproduce a dataset adequately, perhaps
some ingredient is missing from the model (e.g. for an application
of this principle to galaxy clustering, see Kim et al. 2009). Here
our primary aim is not to develop our understanding of galaxy formation
physics but to produce a synthetic catalogue which resembles
the real Universe as closely as possible. To achieve this end
we allow ourselves the freedom to rescale the model stellar continuum
and emission line luminosities, independently. This preserves
the ranking of the model galaxies in luminosity. This approach is
more powerful than an empirical model as we retain all of the additional
information predicted by the semi-analytical model, such
as the clustering strength of the galaxies. Hereafter we will refer
to the adjusted Bau05 and Bow06 models as Bau05 (r) and
Bow06 (r) respectively, to avoid confusion. We also consider a
sparsely sampled version of the Bow06 model, which we refer to
as Bow06 (d) (see §4.1.1).
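
The rescaling described above multiplies every line luminosity by one global factor and every continuum luminosity by another. The sketch below applies the factors listed in Table 1 to a handful of made-up luminosities, checks that the luminosity ranking is unchanged, and evaluates the implied change in equivalent width (line over continuum). The dictionary layout, function names and input luminosities are illustrative assumptions.

```python
# Rescaling factors as listed in Table 1 (H-alpha line, stellar continuum).
RESCALING = {
    "Bau05": {"c_halpha": 0.35, "c_cont": 0.73},
    "Bow06": {"c_halpha": 1.73, "c_cont": 0.42},
}


def rescale(model, l_halpha, l_cont):
    """Apply the global factors to the line and continuum luminosities."""
    factors = RESCALING[model]
    new_line = [factors["c_halpha"] * value for value in l_halpha]
    new_cont = [factors["c_cont"] * value for value in l_cont]
    return new_line, new_cont


def rank_order(values):
    """Indices that sort the values from faintest to brightest."""
    return sorted(range(len(values)), key=values.__getitem__)


def ew_shift(model):
    """Factor by which EW = L_line / L_cont changes after rescaling."""
    factors = RESCALING[model]
    return factors["c_halpha"] / factors["c_cont"]


if __name__ == "__main__":
    line = [3.0e41, 1.0e40, 5.0e42]   # made-up H-alpha luminosities
    cont = [2.0e39, 8.0e38, 1.0e40]   # made-up continuum luminosity densities
    for model in RESCALING:
        new_line, _ = rescale(model, line, cont)
        same_rank = rank_order(new_line) == rank_order(line)
        print(f"{model}: ranking preserved = {same_rank}, "
              f"EW factor = {ew_shift(model):.2f}")
```

Because each factor is a positive constant, the ordering of galaxies by luminosity cannot change; only the mapping between rank and luminosity does. Under these factors the equivalent widths fall by roughly a factor of two for Bau05 and rise by roughly a factor of four for Bow06.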



4.1.1 H-band selected mock catalogues 

In Fig. 4 we first compare the model predictions without any
rescaling of the luminosities against a compilation of observed
number counts in the H-band, kindly provided by Nigel Metcalfe.
Observational data are taken from the following sources, shown
with different symbols: black plus-signs from Metcalfe et al.
(2006); purple asterisks from Frith et al. (2006); purple diamonds
from Metcalfe et al. (2006); blue triangles from Yan et al. (1998);
blue squares from Teplitz et al. (1998); cyan crosses from the
second data release of the 2MASS Survey (footnote 1); green circles from
Thompson et al. (1999); green plus-signs from Martini (2001);
green asterisks from Chen et al. (2002); green diamonds from
Moy et al. (2003); green triangles from the 2MASS extended source
catalogue (footnote 2); orange squares from Frith et al. (2006); and orange
triangles from Retzlaff et al. (2009).

1 http://www.ipac.caltech.edu/2mass/releases/second/#skycover

2 http://www.ipac.caltech.edu/2mass/releases/allsky/doc/sec2_3d3.html

Figure 6. The Hα LF at z = 0.9. The symbols show observational data, with the sources indicated in the key. The dotted curves show the original predictions for the Hα luminosity function, as plotted in Fig. [1]. The solid curves show the model predictions after rescaling the Hα luminosity to better match the observed LF at log(L_Hα [erg s^-1]) = 42, which corresponds to a flux limit of log(F_Hα [erg s^-1 cm^-2]) = -15.3 at this redshift.

There is a factor of three spread in the observed counts around H_AB = 20-22. The unscaled models agree quite well with the observations at H_AB = 20 but overpredict the counts at H_AB = 22, the likely depth of a slit-based redshift survey from space. There are two ways in which the model predictions can be brought into better agreement with the observed counts at H_AB = 22: first, by rescaling the luminosities of the model galaxies to make them fainter in the H-band or, second, by artificially reducing the number density of galaxies at each magnitude. The first correction could be explained as applying extra dust extinction to the model galaxies; as we will see later on, the typical redshift of the galaxies is z ~ 0.5-1, shifting the observer frame H into the rest frame R to V-band. The second correction has no physical basis and is equivalent to taking a sparse sampling of the catalogue at random, i.e. diluting the catalogue. Galaxies are removed at random without regard to their size or redshift. (Note that the dissolution of galaxies invoked by Kim et al. 2009 only applies to satellite galaxies within haloes, and is mass dependent, and hence is very different from the random dilution applied here.) The motivation behind the second approach is that the shape of the original redshift distribution of the model is preserved. As we shall see, the first approach, rescaling the model galaxy luminosities, produces a significant change in the shape of the predicted redshift distribution.
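The contrast between the two corrections can be illustrated with a toy catalogue. The sketch below is not the GALFORM pipeline: the catalogue, its magnitude-redshift relation and the 50% dilution fraction are hypothetical values chosen only for illustration, while the +0.92 magnitude shift is the dimming quoted for the Bow06 model. It shows that random dilution leaves the mean redshift of a magnitude-limited sample essentially unchanged, whereas dimming moves the sample to lower redshift.

```python
import random


def dim(galaxies, delta_mag):
    """First correction: make every galaxy fainter by delta_mag magnitudes."""
    return [(mag + delta_mag, z) for mag, z in galaxies]


def dilute(galaxies, keep_fraction, rng):
    """Second correction: keep each galaxy with the same probability,
    regardless of its magnitude or redshift."""
    return [g for g in galaxies if rng.random() < keep_fraction]


def mean_redshift(galaxies, mag_limit):
    """Mean redshift of the galaxies brighter than the magnitude limit."""
    zs = [z for mag, z in galaxies if mag < mag_limit]
    return sum(zs) / len(zs)


# Hypothetical toy catalogue: (apparent magnitude, redshift) pairs in which
# more distant galaxies are fainter on average.
rng = random.Random(1)
toy = []
for _ in range(20000):
    z = rng.uniform(0.0, 2.0)
    toy.append((19.0 + 2.0 * z + rng.gauss(0.0, 0.5), z))

z_original = mean_redshift(toy, 22.0)
z_dimmed = mean_redshift(dim(toy, 0.92), 22.0)       # shape of dN/dz changes
z_diluted = mean_redshift(dilute(toy, 0.5, rng), 22.0)  # shape preserved
```

In this toy setting the diluted sample has half the galaxies but the same redshift distribution, which is the property that motivates the second approach.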

It is worth remarking in passing that the semi-analytical models used here have already been compared to the observed counts in the K-band (Gonzalez-Perez et al. 2009). The Bow06 model was found to agree very well with the K-band observations whereas the Bau05 model underpredicted the counts by up to a factor of three. This is a somewhat different impression of the relative merits of the models from that reached on comparing to the observed H-band counts, which is surprising given the proximity of the bands and the similarity in the masses of the stars which dominate the light from the composite stellar populations at these wavelengths.

The agreement with the observed counts is improved at H_AB = 22 by shifting the Bow06 galaxy magnitudes faintwards by +0.92 magnitudes; the Bau05 model requires a more modest dimming of +0.33 magnitudes (see Table [1]).

Figure 7. The redshift distribution of Hα selected galaxies for 3 different flux limits: log(F_Hα [erg s^-1 cm^-2]) > -15.3, -15.7 and -16.0, shown in red, blue and green respectively. The solid lines show the Bau05(r) predictions and the dashed lines show the Bow06(r) predictions. In the top panel, galaxies contributing to the redshift distribution have no cut imposed on the equivalent width of Hα. In the bottom panel, the model galaxies have to satisfy the Hα flux limit and a cut on the observed equivalent width of Hα of EW_obs > 100 Å.

The redshift distribution of H-band selected galaxy samples provides a further test of the models. In Fig. [5] the model predictions are compared against an estimate of the redshift distribution compiled using observations from the COSMOS survey and the Hubble Ultra-Deep Field for H_AB < 22 and H_AB < 23 (Cirasuolo et al. 2008; Cirasuolo, Le Fevre and McCracken, private communication). If we focus first on the lower panel, which shows dN/dz in the randomly diluted Bow06 model, denoted Bow06(d), it is apparent that the original Bow06 model predicted the correct shape for the redshift distribution of sources, but with too many galaxies at each redshift. In the upper panel of Fig. [5] we see that the models with the shifted H-band luminosities give shallower redshift distributions than the observed one. The difference between the predicted dN/dz after dimming the luminosities or diluting the number of objects has important implications for the number density of galaxies as a function of redshift, which in turn is important for the performance of a sample in measuring the large-scale structure of the Universe.

Figure 8. The spatial distribution of galaxies and dark matter in the Bow06(r) model at z = 1. Dark matter is shown in grey, with the densest regions shown with the brightest shading. Galaxies selected by their Hα emission with log(F_Hα [erg s^-1 cm^-2]) > -16.00 and EW_obs > 100 Å are shown in red in the left-hand panels. Galaxies brighter than H_AB = 22 are shown in green in the right-hand panels. Each row shows the same region from the Millennium simulation. The first row shows a slice of 200 h^-1 Mpc on a side and 10 h^-1 Mpc deep. The second row shows a zoom into a region of 50 h^-1 Mpc on a side and 10 h^-1 Mpc deep, which corresponds to the white square drawn in the first row images. Note that all of the galaxies which pass the selection criteria are shown in these plots.



4.1.2 Hα-selected mock catalogues

The original model predictions for the Hα luminosity function were presented in Fig. [1]. The models cross one another and match the observed Hα LF at a luminosity of log(L_Hα [erg s^-1]) ~ 41.5. At z = 0.9, this corresponds to a flux of log(F_Hα [erg s^-1 cm^-2]) = -15.8. The flux limit attainable by Euclid is likely to be somewhat brighter than this, although the precise number is still under discussion. For this reason, we chose to force the models to agree with the observed Hα LF at log(L_Hα [erg s^-1]) = 42 at z = 0.9, which corresponds to a flux limit of log(F_Hα [erg s^-1 cm^-2]) = -15.3 (see Fig. [6]). Before rescaling, the model LFs differ by a factor of three at log(L_Hα [erg s^-1]) ~ 42. In the rescaling, the Hα line luminosity is boosted in the Bow06 model and reduced in the case of the Bau05 model (see Table [1] for the correction factors used in both cases). The latter could be explained as additional dust extinction applied to the emission line, compared with the extinction experienced by the stellar continuum. The former correction, a boost to the Hα luminosity in the Bow06 model, is harder to explain. It would require a boost in the production of Lyman-continuum photons (e.g. as would result on invoking a top-heavy IMF in starbursts or an increase in the star formation rate). Such a change would require a revision to the basic physical ingredients of the model and is beyond the scope of the current paper.
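The correspondence between the luminosities and fluxes quoted above can be checked with a short calculation. The sketch below assumes a flat ΛCDM background with Ω_m = 0.25 (the Millennium Simulation value) and assumes that the model luminosities are expressed in units of h^-2 erg s^-1, so that the Hubble parameter cancels when the luminosity distance is expressed in h^-1 Mpc. Both choices are assumptions made for this illustration; the function names are hypothetical.

```python
import math

C_KM_S = 299792.458    # speed of light in km/s
MPC_IN_CM = 3.0857e24  # one megaparsec in cm


def comoving_distance(z, omega_m=0.25, steps=1000):
    """Comoving distance in h^-1 Mpc for flat LCDM, using Simpson's rule."""
    def inv_e(x):
        return 1.0 / math.sqrt(omega_m * (1.0 + x) ** 3 + 1.0 - omega_m)

    width = z / steps  # steps must be even for Simpson's rule
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * inv_e(i * width)
    return (C_KM_S / 100.0) * total * width / 3.0


def log_flux(log_luminosity, z, omega_m=0.25):
    """log10 of the flux in erg s^-1 cm^-2 for a luminosity in h^-2 erg s^-1."""
    d_lum_cm = (1.0 + z) * comoving_distance(z, omega_m) * MPC_IN_CM
    return log_luminosity - math.log10(4.0 * math.pi * d_lum_cm ** 2)


flux_at_42 = log_flux(42.0, 0.9)    # close to the quoted -15.3
flux_at_41p5 = log_flux(41.5, 0.9)  # close to the quoted -15.8
```

Under these assumptions the two luminosities map onto the two flux limits quoted in the text to within a few hundredths of a dex.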

After making this correction to the Hα line flux in the models, we next present the predictions for the redshift distribution of Hα emitters. Fig. [7] shows dN/dz for flux limits of log(F_Hα [erg s^-1 cm^-2]) = [-15.7, -16.0, -16.3]. The redshift distribution of the Bow06(r) model peaks around z ~ 0.5 and declines sharply approaching z ~ 2, whereas the Bau05(r) dN/dz are much broader. The lower panel of Fig. [7] shows the redshift distribution after applying the flux limits and a cut on the observed equivalent width of EW_obs > 100 Å. (Note that the dN/dz is not sensitive to low EW cuts; results similar to the case with no EW cut are obtained with a cut at 10 Å in both models.) In the rescaled model, the equivalent width changes because the Hα line flux has been adjusted and because the continuum has been altered (by the same shift as applied to the H-band). Adding the selection on equivalent width results in a modest change to the predicted dN/dz in the Bow06(r) model. In the Bau05(r) model, the dN/dz shifts to higher redshifts. There is no observational data on the redshift distribution of Hα emitters to compare against the model predictions. Geach et al. (2009) make an empirical estimate of the redshift distribution, by fitting a model for the evolution of the luminosity function to observational data. The luminosity of the characteristic break in the luminosity function, L*, is allowed to vary, while the faint end slope and normalisation are held fixed. The resulting empirical LF looks similar to the original Bau05 model at z = 0.9, and the two have similar redshift distributions. The Hα redshift distributions in the Bow06(r) models are shallower than the empirical estimate; the Bau05(r) model has a similar shape to the empirical redshift distribution, but with a lower normalisation. It is important to realise that the approach of Geach et al. is also model dependent: the choices of model for the evolution of the luminosity function and of which observational datasets to match are not unique and will have an impact on the resulting form of the redshift distribution.



4.2 The clustering of Hα and H-band selected samples

The semi-analytic galaxy formation model predicts the number of galaxies hosted by dark matter haloes of different mass. In the cases of Hα emission, which is primarily sensitive to ongoing star formation, and H-band light, which depends more on the number of long-lived stars, different physical processes determine the number of galaxies per halo. The model predicts contrasting spatial distributions for galaxies selected according to their Hα emission or H-band flux. We compare in Fig. [8] the spatial distribution of Hα emitters with fluxes log(F_Hα [erg s^-1 cm^-2]) > -16 and EW_obs > 100 Å (red circles) with that of an H-band selected sample with H_AB < 22 (green circles), in the Bow06(r) model, which is set in the Millennium Simulation. The upper panels of Fig. [8] show how the different galaxy samples trace the underlying cosmic web of dark matter. The lower panels of Fig. [8] show a zoom into a massive supercluster. There is a marked difference in how the galaxies trace the dark matter on these scales. The Hα emitters avoid the most massive dark matter structures. At the centre of massive haloes, the gas cooling rate is suppressed in the model due to AGN heating of the hot halo. This reduces the supply of gas for star formation and in turn cuts the rate of production of Lyman continuum photons, and hence the Hα emission. The H-band selected galaxies, on the other hand, sample the highest mass dark matter structures.

To study the difference in the spatial distribution of galaxies in a quantitative way, we compare the clustering predictions from the models with observational data. Instead of computing the correlation function explicitly, we use the same method explained in Section 3.3 to calculate the effective bias and use this to derive the correlation length, r_0, a measure of the clustering amplitude, which we define as the pair separation at which the correlation function equals unity. The correlation function of galaxies, ξ_gal, is related to the correlation function of dark matter, ξ_dm, by ξ_gal = b^2 ξ_dm. The effective bias is approximately constant on large scales (e.g. Angulo et al. 2008a). We use the Smith et al. (2003) prescription to generate a nonlinear matter power spectrum in real space. This in turn is Fourier transformed to obtain the two-point correlation function of the dark matter, ξ_dm. We can then derive ξ_gal for any survey configuration by multiplying ξ_dm by the square of the effective bias, and then we read off the correlation length as the scale at which the correlation function is equal to unity.

Figure 9. The correlation length, r_0, as a function of redshift for selected Hα and H-band samples. Solid and dashed lines show the predictions of the Bau05 and Bow06 models respectively. The top panel shows the predictions for different Hα limiting fluxes, log(F_Hα [erg s^-1 cm^-2]) > [-16.0, -16.5, -17.0], in green, orange and blue respectively. Observational data is shown with symbols. The bottom panel shows the model predictions for H_AB < [20, 20.5] in orange and blue respectively. In this case there are two sets of observational estimates, based on different assumptions for the evolution of clustering with redshift.
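The final step of this procedure, reading off r_0 from a tabulated dark matter correlation function and an effective bias, can be sketched as follows. The power-law ξ_dm used in the example is a hypothetical stand-in for the Fourier transform of the Smith et al. (2003) power spectrum, and the function name is illustrative; only the relation ξ_gal = b^2 ξ_dm and the definition ξ_gal(r_0) = 1 are taken from the text.

```python
import math


def correlation_length(r_values, xi_dm, bias):
    """Return r0 such that bias**2 * xi_dm(r0) = 1.

    r_values must be increasing and xi_dm positive and decreasing near the
    crossing; the crossing is located by linear interpolation in log-log space.
    """
    xi_gal = [bias ** 2 * x for x in xi_dm]
    for i in range(len(r_values) - 1):
        hi, lo = xi_gal[i], xi_gal[i + 1]
        if hi >= 1.0 >= lo > 0.0 and hi > lo:
            frac = math.log(hi) / (math.log(hi) - math.log(lo))
            log_r0 = math.log(r_values[i]) + frac * (
                math.log(r_values[i + 1]) - math.log(r_values[i]))
            return math.exp(log_r0)
    raise ValueError("xi_gal does not cross unity on the tabulated range")


# Hypothetical dark matter correlation function: a power law with
# r0 = 5 h^-1 Mpc and slope 1.8, tabulated from 0.5 to 50 h^-1 Mpc.
radii = [0.5 * 10.0 ** (0.05 * i) for i in range(41)]
xi_matter = [(r / 5.0) ** -1.8 for r in radii]

r0_unbiased = correlation_length(radii, xi_matter, bias=1.0)
r0_antibiased = correlation_length(radii, xi_matter, bias=0.5)
```

For a power law of slope γ the result reduces to r_0 = r_0,dm b^(2/γ), so in this toy case a bias of 0.5 more than halves the correlation length.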

Fig. [9] shows the correlation length in comoving units for both Hα and H-band samples at different redshifts, compared to observational estimates. Differences in the bias predicted by the two models (as shown in Fig. [3]) translate into similar differences in r_0. The correlation length declines with increasing redshift for Hα emitters in the Bau05(r) model, since the increase of the effective bias with redshift is not strong enough to balance the decline of the amplitude of clustering of the dark matter. For the range of flux limits shown in the top panel of Fig. [9] (-17 < log(F_Hα [erg s^-1 cm^-2]) < -16), r_0 changes from ~ 5-7 h^-1 Mpc at z = 0.1 to r_0 ~ 3.5 h^-1 Mpc at z = 2.5. On the other hand, the Bow06(r) model shows a smooth increase of r_0 which depends on flux and redshift. At bright flux limits r_0 evolves rapidly at high redshift, reaching r_0 = 4.3 h^-1 Mpc at z = 2.5. At fainter luminosities the change in correlation length with redshift is weaker.

The currently available observational estimates of the clustering of near infrared selected galaxy samples mainly come from angular clustering. A number of assumptions are required in order to derive a spatial correlation length from the angular correlation function. First, a form must be adopted for the distribution of sources in redshift. Second, some papers quote results in terms of proper separation whereas others report in comoving units. Lastly, an evolutionary form is sometimes assumed for the correlation function (Groth & Peebles 1977). In this case, the results obtained for the correlation length depend upon the choice of evolutionary model.

Estimates of the correlation length of Hα emitters are available at a small number of redshifts from narrow band surveys, as shown in Fig. [9] (Morioka et al. 2008; Shioya et al. 2008; Nakajima et al. 2008; Geach et al. 2008). These surveys are small and sampling variance is not always included in the error bar quoted on the correlation length (see Orsi et al. 2008 for an illustration of how sampling variance can affect measurements of the correlation function made from small fields). The models are in reasonable agreement with the estimate by Geach et al. (2008) at z = 2.2, but overpredict the low redshift measurements. The z = 0.24 measurements are particularly challenging to reproduce. The correlation length of the dark matter in the ΛCDM model is around 5 h^-1 Mpc at this redshift, so the z = 0.24 result implies an effective bias of b < 0.5. Gao & White (2007) show that dark matter haloes at the resolution limit of the Millennium Simulation, M ~ 10^10 h^-1 M_⊙, do not reach this level of bias, unless only the youngest 20% of haloes of this mass are selected. In the Bow06(r) model, the Hα emitters populate a range of halo masses, with a spread in formation times, and so the effective bias is closer to unity. Another possible explanation for the discrepancy is that the observational sample could be contaminated by objects which are not Hα emitters and which dilute the clustering signal.

The bottom panel of Fig. [9] shows the correlation length evolution for different H-band selections, compared to observational estimates from Firth et al. (2002). Note that the samples analysed by Firth et al. are significantly brighter than the typical samples considered in this paper (H_AB = 20 versus H_AB = 22). Firth et al. use photometric redshifts to isolate galaxies in redshift bins before measuring the angular clustering. Two sets of observational estimates are shown for each magnitude limit, corresponding to two choices for the assumed evolution of clustering. Again the models display somewhat stronger clustering than the observations would suggest at low redshift. The Bau05(r) model predicts a clustering length which increases with redshift. The Bow06(r) model, on the other hand, predicts a peak in the correlation length around z ~ 0.7, with a decline to higher redshifts. This reflects the form of the luminosity - halo mass relation for galaxy formation models with AGN feedback (Kim et al. 2009). The slope of the luminosity - mass relation changes at the mass for which AGN heating becomes important. Coupled with the appreciable scatter in the predicted relation, this can result in the brightest galaxies residing in haloes of intermediate mass.



4.3 Redshift-space distortions 

The amplitude of gravitationally induced bulk flows is sensitive to the rate at which perturbations grow, which depends on the expansion history of the universe and the nature of the dark energy (Wang 2008; Guzzo et al. 2008). Bulk flows can be measured by their impact on the correlation function of galaxies when plotted as a function of pair separation perpendicular and parallel to the line of sight, ξ(r_σ, r_π) (Hawkins et al. 2003; Ross et al. 2007). We now restrict our attention to the Bow06(r) model, since this is set in the Millennium Simulation and we can measure the clustering of the model galaxies directly. As the Millennium simulation has periodic boundary conditions, we can estimate the correlation function as follows:

    \xi(r_\sigma, r_\pi) = \frac{DD_{\sigma,\pi}}{N \, n \, \Delta V_{\sigma,\pi}} - 1,    (1)

    \Delta V_{\sigma,\pi} = 2\pi \, r_\sigma \, \Delta r_\sigma \, \Delta r_\pi,    (2)

where DD_{σ,π} is the number of distinct galaxy pairs in a bin of pair separation centred on (r_σ, r_π), Δr_σ and Δr_π are the widths of the bins in the r_σ and r_π directions, respectively, N and n are the total number of galaxies and the number density of galaxies in the sample, and ΔV_{σ,π} corresponds to the volume enclosed in an annulus centred on (r_σ, r_π). Note that to avoid any confusion, here we refer to the line of sight separation as r_π and use π to denote the mathematical constant.
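A minimal sketch of such an estimator for a periodic box is given below. It is a brute-force illustration rather than the code used for the Millennium measurements: the line of sight is taken along the z axis, each distinct pair is counted once, both signs of the line-of-sight separation are folded into a single |r_π| bin, and the annulus volume is taken to be 2π r_σ Δr_σ Δr_π evaluated at the bin centre. The function name, the box size and the random test catalogue are hypothetical.

```python
import bisect
import math
import random


def xi_sigma_pi(points, box, sigma_edges, pi_edges):
    """Estimate xi(r_sigma, r_pi) in a periodic box of side `box`.

    All bin edges must be smaller than box / 2 so that the shortest periodic
    separation is unambiguous.
    """
    n_gal = len(points)
    density = n_gal / box ** 3
    n_s, n_p = len(sigma_edges) - 1, len(pi_edges) - 1
    pairs = [[0] * n_p for _ in range(n_s)]

    def wrap(d):
        # Shortest separation along one axis, given periodic boundaries.
        d = abs(d)
        return min(d, box - d)

    for i in range(n_gal):
        x1, y1, z1 = points[i]
        for j in range(i + 1, n_gal):  # distinct pairs only
            x2, y2, z2 = points[j]
            r_pi = wrap(z1 - z2)
            r_sigma = math.hypot(wrap(x1 - x2), wrap(y1 - y2))
            s = bisect.bisect_right(sigma_edges, r_sigma) - 1
            p = bisect.bisect_right(pi_edges, r_pi) - 1
            if 0 <= s < n_s and 0 <= p < n_p:
                pairs[s][p] += 1

    xi = [[0.0] * n_p for _ in range(n_s)]
    for s in range(n_s):
        r_mid = 0.5 * (sigma_edges[s] + sigma_edges[s + 1])
        d_sigma = sigma_edges[s + 1] - sigma_edges[s]
        for p in range(n_p):
            d_pi = pi_edges[p + 1] - pi_edges[p]
            annulus = 2.0 * math.pi * r_mid * d_sigma * d_pi
            xi[s][p] = pairs[s][p] / (n_gal * density * annulus) - 1.0
    return xi


# Sanity check on an unclustered (uniform random) catalogue: xi should be
# consistent with zero in every bin, up to shot noise.
rng = random.Random(42)
BOX = 50.0
randoms = [(rng.uniform(0, BOX), rng.uniform(0, BOX), rng.uniform(0, BOX))
           for _ in range(600)]
xi_random = xi_sigma_pi(randoms, BOX, [0.0, 5.0, 10.0, 15.0], [0.0, 5.0, 10.0])
```

No random catalogue is needed for the normalisation because the expected pair count in a periodic box is known analytically, which is the advantage of the periodic boundary conditions noted above.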

In redshift surveys, the radial distance to a galaxy is inferred from its redshift. The measured redshift contains a contribution from the expansion of the Universe, along with a peculiar velocity which is induced by inhomogeneities in the density field around the galaxy. Thus the position inferred from the redshift is not necessarily the true position. The distortion of the clustering pattern resulting from peculiar velocities is referred to as the redshift space distortion. On large scales, coherent motions of galaxies from voids towards overdense regions lead to a boost in the clustering amplitude (Kaiser 1987):

    \frac{\xi(s)}{\xi(r)} = 1 + \frac{2}{3}\beta + \frac{1}{5}\beta^2,    (3)

where ξ(s) is the spherically averaged, redshift space correlation function, and ξ(r) is its equivalent in real space (i.e. without the contribution of peculiar velocities). Eq. [3] holds in linear perturbation theory in the distant observer approximation when gradients in the bulk flow and the effect of the velocity dispersion are small (Cole et al. 1994; Scoccimarro 2004). Strictly speaking, these approximations apply better on large scales. The parameter β is related to the linear growth rate, D, through

    \beta = \frac{1}{b} \frac{d \ln D}{d \ln a}    (4)

    \approx \frac{\Omega_m(z)^{\gamma}}{b},    (5)

where a is the expansion factor. The approximation in Eq. [5] is valid for an open cosmology, in which γ is traditionally approximated to 0.6 (Peebles 1980). Lahav et al. (1991) showed that this approximation should be modified in the case of a CDM model with a cosmological constant, to display a weak dependence on Λ. Lue, Scoccimarro & Starkman (2004) pointed out that the value of γ allows one to differentiate between modified gravity and dark energy, since β(z) ≈ Ω_m(a)^{2/3}/b for DGP gravity models, while β(z) ≈ Ω_m(a)^{5/9}/b for a flat Universe with a cosmological constant.
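The size of the difference between the two exponents can be gauged with a few lines of code. The sketch below evaluates the approximation β ≈ Ω_m(z)^γ / b for γ = 5/9 and γ = 2/3. As a simplifying assumption, the same flat ΛCDM expression for Ω_m(z), with a present-day value of 0.25, is used for both exponents; a DGP model has a different expansion history, so the numbers are indicative only. The function names are hypothetical.

```python
def omega_m_z(z, omega_m0=0.25):
    """Matter density parameter at redshift z for a flat LCDM background."""
    matter = omega_m0 * (1.0 + z) ** 3
    return matter / (matter + 1.0 - omega_m0)


def beta_approx(z, bias, gamma, omega_m0=0.25):
    """Approximate beta = Omega_m(z)**gamma / bias."""
    return omega_m_z(z, omega_m0) ** gamma / bias


beta_lambda = beta_approx(1.0, bias=1.0, gamma=5.0 / 9.0)  # cosmological constant
beta_dgp = beta_approx(1.0, bias=1.0, gamma=2.0 / 3.0)     # DGP-like exponent
fractional_difference = (beta_lambda - beta_dgp) / beta_lambda
```

With these assumptions the two exponents give values of β at z = 1 that differ by a few per cent, which sets the scale of the measurement accuracy needed to tell them apart.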

On small scales, the randomised motions of galaxies inside virialised structures lead to a damping of the redshift space correlation function and a drop in the ratio ξ(s)/ξ(r) (Cole et al. 1994).

The impact of peculiar velocities on the clustering of galaxies is clearly seen in ξ(r_σ, r_π). The top panels of Fig. [10] show the correlation function of Hα emitters selected to have log(F_Hα [erg s^-1 cm^-2]) > -16 and EW_obs > 100 Å (left) and H-band selected galaxies with H_AB < 22 (right). In the top and middle rows of Fig. [10] all galaxies are used down to the respective flux limits. To obtain clustering in redshift space, we use the distant observer approximation and give the galaxies a displacement along one of the cartesian axes, as determined by the component of the peculiar velocity along the same axis. Without peculiar velocities, contours of constant clustering amplitude in ξ(r_σ, r_π) would be circular. In redshift space, the clustering of H-band selected galaxies exhibits a clear signature on small scales of a contribution from high velocity dispersion systems, the so called "fingers of God". This effect is less evident in the clustering of the Hα sample, as these galaxies avoid massive haloes, as shown in Fig. [8]. On large scales, the contours of equal clustering are flattened due to coherent flows. Similar distortions have been measured in surveys such as the 2dFGRS (Hawkins et al. 2003) and the VLT-VIMOS deep survey (Guzzo et al. 2008).

Figure 10. The two point correlation function, measured in redshift space, plotted in bins of pair separation parallel (r_π) and perpendicular (r_σ) to the line of sight, ξ(r_σ, r_π), for Hα emitters (left-hand panels) and H-band selected (right-hand panels) galaxies in the Millennium simulation. The samples used are those plotted in Fig. [8]. The pair counts are replicated over the four quadrants to enhance the visual impression of deviations from circular symmetry. The Hα catalogue has a limiting flux of log(F_Hα [erg s^-1 cm^-2]) > -16 and an equivalent width cut of EW_obs > 100 Å; the H-band magnitude limit is H_AB = 22. The contours show where log(ξ(r_σ, r_π)) = [0.5, 0.0, -0.5, -1.0, -1.5], from small to large pair separations. The upper panels show the correlation function measured in fully sampled catalogues without redshift errors. The middle panels show how redshift errors change the clustering pattern. Representative errors for the two redshift measurements are used: σ_z = 10^-3 for the slitless case (Hα emitters), and σ_z = 5 × 10^-4 for the slit based measurement (H-band selected). In the upper and middle panels, all the galaxies are used to compute the correlation function. In the bottom panels, only 33% of the galaxies are used in each case, which is indicative of the likely redshift success rate for a survey from space.

In practice, the measured correlation functions will look somewhat different to the idealised results presented in the top row of Fig. [10]. The redshift measurements will have errors, and the errors for slitless spectroscopy are expected to be bigger than those for slit-based spectroscopy (Euclid-NIS team, private communication). We model this by adding a Gaussian distributed velocity, v_r, to the peculiar velocities following δz = (1 + z) v_r / c. The dispersion of the Gaussian is parametrized by σ_z = <δz^2>^{1/2} / (1 + z). We show the impact on the predicted clustering of adding illustrative redshift uncertainties to the position measurements in the middle and bottom panels of Fig. [10]. For Hα emitters, we chose a fiducial error of σ_z = 10^-3, based on simulations by the Euclid NIS team. The errors on the slit-based redshifts are expected to be at least a factor of 2 smaller than the slitless errors, so we set σ_z = 5 × 10^-4 for the H_AB selected sample. The impact of the redshift errors is most prominent in the case of the Hα sample, where the contours of constant clustering become more elongated along the line-of-sight direction.
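To give a feel for the size of these errors, the sketch below converts σ_z into the equivalent velocity dispersion, c σ_z (which follows from the two definitions above), and into a comoving line-of-sight displacement. The conversion (1 + z) v_r / H(z) and the flat ΛCDM background with Ω_m = 0.25 are assumptions of this illustration, as are the function names and the periodic wrapping of the perturbed positions.

```python
import math
import random

C_KM_S = 299792.458  # speed of light in km/s


def velocity_dispersion(sigma_z):
    """Velocity dispersion in km/s implied by sigma_z = <dz^2>^(1/2) / (1 + z)."""
    return C_KM_S * sigma_z


def los_displacement(v_r, z, omega_m=0.25):
    """Comoving line-of-sight shift in h^-1 Mpc for a velocity v_r in km/s."""
    hubble = 100.0 * math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)
    return (1.0 + z) * v_r / hubble  # hubble is in km/s per h^-1 Mpc


def add_redshift_errors(los_positions, sigma_z, z, box, rng):
    """Perturb line-of-sight positions with Gaussian redshift errors."""
    sigma_v = velocity_dispersion(sigma_z)
    return [(x + los_displacement(rng.gauss(0.0, sigma_v), z)) % box
            for x in los_positions]


sigma_v_slitless = velocity_dispersion(1e-3)  # about 300 km/s
sigma_v_slit = velocity_dispersion(5e-4)      # about 150 km/s
shift_slitless = los_displacement(sigma_v_slitless, 1.0)  # h^-1 Mpc at z = 1
```

Under these assumptions a slitless error of σ_z = 10^-3 corresponds to an rms displacement of a few h^-1 Mpc at z = 1, comparable to the scale of the small-scale distortions discussed above.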

A measure of how well bulk flows can be constrained can be gained from the accuracy with which β can be measured (Eq. [4]). We estimate β by applying Eq. [3] to the ratio of the redshift space to real space correlation function on pair separations between 15-30 h^-1 Mpc, which is close to the maximum pair separation out to which we can reliably measure clustering in the Millennium simulation volume. The introduction of redshift errors forces us to apply Eq. [3] to the measurements from the Millennium simulation on larger scales than in the absence of errors. We note that the ratio is noisy even for a box of the volume of the Millennium, and in practice we average the ratio by projecting down each of the cartesian axes. The real space correlation function is difficult to estimate on large scales, so a less direct approach would be applied to actual survey data (see e.g. Guzzo et al. 2008). Hence, our results will be on the optimistic side of what is likely to be attainable with future surveys. Ideally, we would like to apply Eq. [3] on as large a scale as possible, since Kaiser's derivation assumes that the perturbations are in the linear regime.
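Recovering β from a measured ratio amounts to inverting the quadratic Kaiser (1987) boost factor, 1 + 2β/3 + β^2/5. The sketch below shows one way to do this; the averaging of the ratio over a list of separation bins and the sample values are hypothetical and do not reproduce the measurements in this paper.

```python
import math


def kaiser_boost(beta):
    """Ratio of redshift space to real space correlation functions."""
    return 1.0 + 2.0 * beta / 3.0 + beta ** 2 / 5.0


def beta_from_ratio(ratio):
    """Positive root of beta^2 / 5 + 2 beta / 3 + (1 - ratio) = 0."""
    discriminant = 4.0 / 9.0 + 0.8 * (ratio - 1.0)
    if discriminant < 0.0:
        raise ValueError("ratio is too small for a real solution")
    return (-2.0 / 3.0 + math.sqrt(discriminant)) / 0.4


def beta_from_measurements(xi_redshift, xi_real):
    """Average the ratio over the supplied separation bins, then invert."""
    ratios = [s / r for s, r in zip(xi_redshift, xi_real)]
    return beta_from_ratio(sum(ratios) / len(ratios))


# Hypothetical, noise-free measurements in three separation bins.
xi_r = [0.10, 0.06, 0.04]
xi_s = [kaiser_boost(0.5) * value for value in xi_r]
beta_recovered = beta_from_measurements(xi_s, xi_r)
```

A ratio below unity, as produced by small-scale damping, returns a negative β, which is one reason why the fit is restricted to large pair separations.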

We solve the integral for the growth rate D in Eq. [4] (see Lahav et al. 1991) and use this exact result with the value of the bias b measured for each galaxy sample to get the theoretical value β_lin. Table [2] shows the comparison between β_m, the measured value of β in the simulation, and the target theoretical value β_lin. Two different selection cuts are chosen for both Hα and H-band samples to cover a range of survey configurations: log(F_Hα [erg s^-1 cm^-2]) > [-15.4, -16.0] for Hα samples and H_AB < [22, 23] for the magnitude limited samples. All the mock catalogues studied return a value for β_m which is systematically below the expected value.

Figure 11. The effective bias (top panel), number density of galaxies (middle panel) and the product nP (bottom panel) as functions of redshift, where P is measured at wavenumber k = 0.2 h Mpc^-1. The solid lines show the predictions for the Bau05(r) model and the Bow06(r) model is shown using dashed lines. The two columns show different Hα and H-band selections. In the first column the Hα sample is defined by a limiting flux of log(F_Hα [erg s^-1 cm^-2]) > -16 and EW_obs > 100 Å (red curves). The magnitude limited sample has H_AB < 22 (blue curves). In the second column the Hα sample has log(F_Hα [erg s^-1 cm^-2]) > -15.4 and EW_obs > 100 Å, and the H-band sample has H_AB < 23. In all panels the redshift success rate considered is 100%.



When redshift errors are omitted and a 100% redshift success rate is used, both selection methods seem to reproduce the expected value of β_lin to within better than ~ 10%. When redshift errors are included, the spatial distribution along the line of sight appears more elongated than it would be if the true galaxy positions could be used. This leads to an increase in the small scale damping of the clustering. However, at the same time contours of constant clustering amplitude are pushed out to larger pair separations in the radial direction. This results in an increase in the ratio of redshift space to real space clustering and an increase in the recovered value of β. When including the likely redshift errors, the values of β_m found are slightly higher than those without redshift errors. This small boost in the value of β_m is greatest in the Hα sample, because of the larger redshift errors than in the H-band sample.

We have also tested the impact of applying different redshift success rates on the determination of β_m. The lower part of Table [2] shows the impact of a 33% redshift success rate. For log(F_Hα [erg s^-1 cm^-2]) > -15.4, our results show that a robust estimate of β is unlikely at this flux limit, because the smaller number density makes the correlation functions very noisy, so that β_m cannot be measured correctly. In contrast, the impact of a 33% success rate on the log(F_Hα [erg s^-1 cm^-2]) > -16 sample is negligible. The β_m values calculated using the H-band catalogues are also mostly unaffected. When redshift uncertainties are considered, as before, the β_m values are closer to the theoretical β_lin. Hence redshift uncertainties will contribute to the uncertainty on β_m, but they still permit an accurate determination of β, provided they do not exceed σ_z = 10^-3.

Table 2. Values of β estimated from the ratio of the redshift space to real space correlation function for the fiducial samples at z = 1. We consider Hα emitters with fluxes log(F_Hα [erg s^-1 cm^-2]) > [-15.4, -16] and H-band selected galaxies with H_AB < [22, 23]. The table is divided into two parts. The first half assumes a redshift success rate of 100% and the second a 33% redshift success rate. Each segment is divided into two, showing the impact on β of including the expected redshift uncertainties: σ_z = 10^-3 for Hα emitters and σ_z = 5 × 10^-4 for H-band selected samples. Column (1) shows β_lin, the exact theoretical value of β obtained when using Eq. [4]. Column (2) shows β_m, the value of β measured in the simulation including the 1σ error. Column (3) shows the fractional error on β_m using the Millennium volume. Column (4) shows the fractional error on β_m obtained when using mock catalogues from the BASICC simulation.

The noisy correlation functions for the configurations with log(F_Hα [erg s^-1 cm^-2]) > -15.4 and a sampling rate of 33% produce measurements of β_m with large errors. The mock catalogues used so far in this section were created from the Millennium simulation, which has V_Mill = 500^3 [Mpc/h]^3. This volume is almost three orders of magnitude smaller than the volume expected in a large redshift survey from a space mission like Euclid (see next section). In order to test the impact of using this limited volume when measuring β_m and its error, we plant the Bow06(r) model into a larger volume using the BASICC N-body simulation (Angulo et al. 2008a), which has a volume almost 20 times larger than the Millennium run (V_BASICC = 1340^3 [Mpc/h]^3). The errors on β_m shown in Table [2] are expected, to first order, to scale with the error on the power spectrum (see Eq. [6] below). If we compare two galaxy samples with the same number density but in different volumes, then the error on β_m should scale as δβ ∝ 1/√V, where V is the volume of the sample.

The only drawback of using the BASICC simulation is that the mass resolution is worse than in the Millennium simulation. Haloes with mass greater than 5.5 × 10^11 M_⊙/h can be resolved in the BASICC simulation. The galaxy samples studied here are hosted by haloes with masses greater than ~ 8 × 10^10 M_⊙/h, so if we only plant galaxies into haloes resolved in the BASICC run then we would miss a substantial fraction of the galaxies. To avoid this incompleteness, those galaxies which should be hosted by haloes below the mass resolution limit are planted on randomly selected ungrouped particles, i.e. dark matter particles which do not belong to any halo. This scheme is approximate and works best if the unresolved haloes have a bias close to unity, i.e. where the bias is not a strong function of mass. This is almost the case in the application of this method to the BASICC run, so the clustering amplitude appears slightly boosted for all the configurations studied here. However, since we only want to study the variation in the error on β_m when using a larger volume, we apply the same method described above to measure β_m in the galaxy samples planted in the BASICC run.

As shown in the fourth column of Table [2], we find that for all the Hα configurations studied here the error on β_m obtained when using the BASICC simulation is a factor of 1-6 smaller than that found with the Millennium samples. The H-band samples, on the other hand, have errors roughly 4 times smaller in the BASICC volume compared to the Millennium volume, which is what we expect if we assume that the error on β_m scales with 1/√V.

The Euclid survey will cover a geometrical volume of ~ 90 [Gpc/h]^3 with an effective volume of around half of this (see next section). We expect that Euclid should measure β_m with an error around 5 times smaller than that estimated for the galaxy samples planted into the BASICC simulation.



4.4 Effective survey volume 

Ongoing and future surveys aim to measure the baryonic acoustic
oscillation (BAO) signal in the power spectrum of galaxies. The
primary consideration for an accurate power spectrum measurement is
to maximize the survey volume in order to maximize the number
of independent $k$-modes. However, because the power spectrum is
measured using a finite number of galaxies there is an associated
discreteness noise. The number density of galaxies in a flux limited
sample drops rapidly with increasing redshift, which means that
discreteness noise also increases. When the discreteness noise
becomes comparable to the power spectrum amplitude, it is difficult
to measure the clustering signal. This trend is encapsulated in the
expression for the fractional error on the power spectrum derived
by Feldman, Kaiser & Peacock (1994):

$$\frac{\sigma}{P} \approx \sqrt{\frac{2}{V}} \left(1 + \frac{1}{\bar{n}P}\right), \qquad (7)$$

where $\sigma$ is the error on the power spectrum $P$, $V$ is the geometrical
survey volume and $\bar{n}$ is the number density of galaxies. When
the contrast of the power spectrum is high, i.e. $\bar{n}P \gg 1$, then the
fractional error scales as the inverse square root of the survey
volume. However, in the case that $\bar{n}P \lesssim 1$, the gain in accuracy from
increasing the survey volume is less than the inverse square root of
the increased volume. The amplitude of the power spectrum compared
to the discreteness noise of the galaxies used to trace the
density field is therefore a key consideration when assessing the
effectiveness of different tracers of the large scale structure of the
Universe.
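
The two regimes can be made concrete with a short numerical sketch. It assumes the commonly quoted form of the Feldman, Kaiser & Peacock error, $\sigma/P \approx \sqrt{2/V}\,(1 + 1/\bar{n}P)$, with the volume in arbitrary units so that only ratios are meaningful. The function names are illustrative.

```python
import math


def fractional_error(volume, nP):
    """sigma/P ~ sqrt(2/V) * (1 + 1/(nP)); volume in arbitrary units."""
    return math.sqrt(2.0 / volume) * (1.0 + 1.0 / nP)


def volume_weight(nP):
    """Fraction of the volume that is effectively used, [nP / (1 + nP)]^2."""
    return (nP / (1.0 + nP)) ** 2


# Shot-noise penalty relative to the sample-variance limit sqrt(2/V).
for nP in (10.0, 1.0, 0.1):
    penalty = fractional_error(1.0, nP) / math.sqrt(2.0)
    print(f"nP = {nP:4.1f}: penalty = {penalty:5.2f}, "
          f"weight = {volume_weight(nP):.3f}")
```

Written this way, the same error is obtained from a shot-noise-free survey of volume $V\,[\bar{n}P/(1+\bar{n}P)]^2$, which is the weighting that enters the effective volume.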

GALFORM gives us all the information required to estimate the
effective volume of a survey with given selection criteria (which
define the number density of galaxies, $\bar{n}(z)$, and the effective bias
as a function of redshift). For simplicity, we use the linear theory
power spectrum of dark matter, which is a reasonable approximation
on the wavenumber scales studied here. The galaxy power
spectrum is assumed to be given by $P_g(k,z) = b(z)^2 P_{\rm dm}(k,z)$,
where $b(z)$ is the effective bias of the galaxy sample. We calculate
the fraction of volume utilized in a given redshift interval following
Tegmark (1997),



$$V_{\rm eff}(k,z) = \int_{z_{\rm min}}^{z_{\rm max}} \left[\frac{\bar{n}(z)P_g(k,z)}{1+\bar{n}(z)P_g(k,z)}\right]^2 \frac{dV}{dz}\,dz, \qquad (8)$$



where all quantities are expressed in comoving coordinates. We
calculate $V_{\rm eff}/V$ for a range of possible survey configurations,
considering different limits in flux, $EW_{\rm obs}$, magnitude and redshift
success rate (see Table 3). The redshift range is chosen to match
that expected to be set by the near-infrared instrumentation to be
used in future surveys.
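
A minimal numerical sketch of Eq. 8 is given below. It is not the calculation used for the results in this paper: the flat cosmology with $\Omega_m = 0.25$, the simple quadrature and the toy forms adopted for $\bar{n}(z)$, $b(z)$ and $P_{\rm dm}(k,z)$ are all assumptions chosen only to make the example self-contained.

```python
import math

C_OVER_H0 = 2997.92458  # Hubble distance c/H0 in Mpc/h
OMEGA_M = 0.25          # assumed matter density of a flat universe


def e_of_z(z):
    """Dimensionless Hubble rate H(z)/H0 for a flat LCDM cosmology."""
    return math.sqrt(OMEGA_M * (1.0 + z) ** 3 + 1.0 - OMEGA_M)


def comoving_distance(z, steps=200):
    """Comoving distance in Mpc/h from trapezoidal integration of 1/E(z)."""
    dz = z / steps
    total = 0.5 * (1.0 / e_of_z(0.0) + 1.0 / e_of_z(z))
    total += sum(1.0 / e_of_z(i * dz) for i in range(1, steps))
    return C_OVER_H0 * total * dz


def dV_dz(z):
    """Comoving volume per unit redshift per steradian, in (Mpc/h)^3."""
    return C_OVER_H0 * comoving_distance(z) ** 2 / e_of_z(z)


def effective_volume_fraction(n_of_z, b_of_z, p_dm_of_z,
                              z_min, z_max, steps=150):
    """V_eff / V from Eq. 8 using the midpoint rule in redshift."""
    dz = (z_max - z_min) / steps
    v_eff = v_geo = 0.0
    for i in range(steps):
        z = z_min + (i + 0.5) * dz
        nP = n_of_z(z) * b_of_z(z) ** 2 * p_dm_of_z(z)
        shell = dV_dz(z) * dz
        v_eff += (nP / (1.0 + nP)) ** 2 * shell
        v_geo += shell
    return v_eff / v_geo


# Toy inputs (illustrative only, not GALFORM predictions).
toy_n = lambda z: 2e-3 * math.exp(-1.5 * z)   # number density, (h/Mpc)^3
toy_b = lambda z: 1.0 + 0.5 * z               # effective bias
toy_p = lambda z: 2500.0 / (1.0 + z) ** 2     # P_dm at fixed k, (Mpc/h)^3

fraction = effective_volume_fraction(toy_n, toy_b, toy_p, 0.5, 2.0)
print(f"V_eff / V = {fraction:.2f}")
```

A useful sanity check is that a constant $\bar{n}P_g = 1$ returns exactly one quarter of the geometrical volume, whatever the cosmology.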

Fig. 11 shows the predictions from GALFORM which are
required to compute the effective volume, for two illustrative
Hα and H-band selected surveys, covering the current expected
flux/magnitude limits of space missions. The bias predicted for
H-band galaxies is at least $\sim 30\%$ higher than that for Hα emitters
in both panels of Fig. 11. This reflects the different spatial
distribution of these samples apparent in Fig. 8, in which it is clear
that Hα emitters avoid cluster-mass dark matter haloes. The middle
panel of Fig. 11 shows the galaxy number density as a function
of redshift for these illustrative surveys. For the Hα selection,
the models predict very different number densities at low redshifts,
as shown also in Fig. 7. For $z > 1$ the Bow06(r) model
predicts progressively more galaxies than the Bau05(r) model for
the H-band selection. Overall, the number density of galaxies in
the H-band sample at high redshift is much lower than that of Hα
emitters. However, we remind the reader that these scaled models
match the H-band counts but have a shallower redshift distribution
than is suggested by the observations. The bottom panel of Fig. 11
shows the product of the number density and the power spectrum,
$\bar{n}P$ (i.e. the power spectrum relative to the shot noise), as a function
of redshift. A survey which efficiently samples the available
volume will have $\bar{n}P > 1$. The slow decline of the number density of
Hα galaxies with redshift in the Bau05(r) model is reflected in
$\bar{n}P > 1$ throughout the redshift range considered here, whereas in
the Bow06(r) model, the Hα sample has a very steeply falling
$\bar{n}P$ curve, with $\bar{n}P < 1$ for $z > 1.5$. The predictions of $\bar{n}P$
for the H-band are similar in both models, dropping below 1 at
$z \sim 1.3 - 1.5$.

The predictions for the bias, number density and power spectrum
of galaxies plotted in Fig. 11 are used in Eq. 8 to calculate
the effective volume, which is shown in Fig. 12. The top panels
show the differential $V_{\rm eff}/V$ calculated in shells of $\Delta z = 0.1$ for
redshifts spanning the range $z = [0.5, 2]$. The bottom panels of
Fig. 12 show the cumulative $V_{\rm eff}$ contained in the redshift range
from $z = 0.5$ up to $z = 2$. We follow previous work and use
the amplitude of the power spectrum at $k = 0.2\,h\,\mathrm{Mpc}^{-1}$, which
roughly corresponds to the centre of the wavenumber range over
which the BAO signal is measured. We show the result for the
fiducial survey selections with different redshift success rates, 100%
and 33%. In addition, for the H-band selected survey, we also show
the results obtained with the alternative approach discussed in the
previous section, in which the galaxies in the Bow06 sample are
diluted by a factor of 0.63.
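
The sketch below shows how the differential and cumulative effective volumes respond to the redshift success rate, under the assumption that a success rate $f$ simply lowers the number density, and hence $\bar{n}P$, by the factor $f$. The shell values of $\bar{n}P$ and the shell volumes are made-up numbers and the function name is illustrative.

```python
from itertools import accumulate


def shell_effective_volumes(nP_shells, shell_volumes, success_rate=1.0):
    """Return (differential V_eff/V per shell, cumulative V_eff)."""
    fractions = []
    for nP in nP_shells:
        nP_obs = success_rate * nP  # fewer redshifts means a lower density
        fractions.append((nP_obs / (1.0 + nP_obs)) ** 2)
    cumulative = list(accumulate(f * v for f, v in zip(fractions, shell_volumes)))
    return fractions, cumulative


# Two toy shells: a well sampled one (nP = 10) and a sparse one (nP = 1).
toy_nP = [10.0, 1.0]
toy_volumes = [1.0, 2.0]  # arbitrary units

frac_full, cum_full = shell_effective_volumes(toy_nP, toy_volumes, 1.0)
frac_33, cum_33 = shell_effective_volumes(toy_nP, toy_volumes, 0.33)

for f100, f33 in zip(frac_full, frac_33):
    print(f"V_eff/V: {f100:.3f} (100%)  {f33:.3f} (33%)")
```

In this toy case the well sampled shell keeps about 70 per cent of its effective volume when the success rate drops to 33%, whereas the sparse shell keeps only about a quarter, so the success rate matters most where $\bar{n}P \lesssim 1$.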

In general, the effective volume is close to the geometrical
volume at low redshifts. This is because $\bar{n}P \gg 1$ at these redshifts.
In the top panels of Fig. 12, where the differential $V_{\rm eff}/V$ is plotted
in shells of $\Delta z = 0.1$, we see that shells at higher redshifts cover
progressively smaller differential effective volumes. This is due to
the overall decrease in the number density of galaxies beyond the
peak in the redshift distribution (see Figs. 5, 7 and 11), which wins
out over the more modest increase in the bias of the galaxies picked
up with increasing redshift. The bottom panels of Fig. 12 show the
same effect: at higher redshifts, the gain in effective volume is much
smaller than the corresponding gain in the geometrical volume of
the survey. We remind the reader that our calculation for the
effective volume in the H-band using models with rescaled luminosities
is likely to be an underestimate, as these models underpredict the
observed high redshift tail of the redshift distribution. A better
estimate is likely to be provided by the Bow06(d) model, in which
the number of galaxies is adjusted by random sampling,
rather than by changing their luminosities. This case is shown by
the green curves in Fig. 12.

The calculations presented in Fig. 12 are extended to a range
of survey specifications in Table 3. This table shows calculations
for two different redshift ranges, $0 < z < 2$ and $0.5 < z < 2$,
and also includes the effect of applying different selection criteria
and redshift success rates to Hα and H-band surveys. An Hα survey
with a limiting flux of $\log(F_{\rm H\alpha}\,[\mathrm{erg\,s^{-1}\,cm^{-2}}]) > -15.4$, an
equivalent width $EW_{\rm obs} > 100$ Å and a sampling rate of 0.33,
similar to the baseline spectroscopic solution for Euclid, would have
a very small $V_{\rm eff}/V \sim 0.04$ for the redshift interval $z = 0.5 - 2$.
Figure 12. The effective volume of Hα- and H-band selected samples. The left-hand panels show results for the Bau05(r) model and the right-hand panels
show the Bow06(r) model; in the latter case, the effective volume for a randomly diluted sample of galaxies from the original Bow06 model is also shown.
The upper row shows the effective volume divided by the geometrical volume in redshift shells of width $\Delta z = 0.1$; the power spectrum at $k = 0.2\,h\,\mathrm{Mpc}^{-1}$ is
used to compute the effective volume (see text). The lower panels show the cumulative effective volume per steradian starting from $z = 0.5$ and extending up to
the redshift at which the curve is plotted. Red curves show the results for Hα selected galaxies with $\log(F_{\rm H\alpha}\,[\mathrm{erg\,s^{-1}\,cm^{-2}}]) > -16$ and $EW_{\rm obs} > 100$ Å.
The solid red line shows the result of applying a redshift success rate of 33%, whereas the red dashed line assumes a 100% success rate. The blue lines show the
results for an H-band magnitude selected survey with $H_{AB} < 22$. As before, the solid blue line shows the results for a sampling rate of 33%, and the dashed
line assumes 100% sampling. The green lines show the results using the Bow06 model diluted (Bow06(d)) to match the observed number counts; as before,
solid and dashed lines show 33% and 100% success rates, respectively. The black solid curves in the bottom panels show the total comoving volume covering the
redshift range shown.



In contrast, an H-band survey with $H_{AB} < 22$ and a sampling
rate of 0.33, an alternative spectroscopic solution for Euclid,
has $V_{\rm eff}/V = 0.19 - 0.27$, or even up to $V_{\rm eff}/V = 0.43$ in the
case of the diluted model. To reach a comparable effective volume,
an Hα survey would need to reach a flux limit of at least
$\log(F_{\rm H\alpha}\,[\mathrm{erg\,s^{-1}\,cm^{-2}}]) > -16$ (at the same equivalent width
cut and redshift success rate).
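
The relative performance of the two selections follows directly from the fractional effective volumes quoted above, since both refer to the same redshift interval and hence the same geometrical volume. The short calculation below uses only those quoted numbers. Applying the single approximate Hα value of 0.04 to every model is an assumption made for the comparison.

```python
# Fractional effective volumes for 0.5 < z < 2 and a sampling rate of 0.33.
veff_halpha = 0.04  # Hα, log F > -15.4, EW_obs > 100 Å (approximate value)
veff_hband = {"Bau05": 0.19, "Bow06": 0.27, "Bow06(d)": 0.43}  # H_AB < 22

ratios = {model: v / veff_halpha for model, v in veff_hband.items()}
for model, ratio in ratios.items():
    print(f"{model}: H-band samples {ratio:.2f} times the Hα effective volume")
```

The ratios are approximately 4.75, 6.75 and 10.75 for the three cases.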

The calculation of the effective volume also allows us to make
an indicative estimate of the accuracy with which the dark energy
equation of state parameter, $w$, can be measured for a given survey
configuration. Angulo et al. (2008a) used large volume N-body
simulations combined with the GALFORM model to calculate the
accuracy with which the equation of state parameter $w$ can be
measured for different galaxy samples. They found a small difference
($\sim 10\%$) in the accuracy with which $w$ can be measured for a
continuum magnitude limited sample and an emission line sample
with the same number density of objects. Their results can be
summarised by:



$$\Delta w(\%) = \frac{1.5}{\sqrt{V_{\rm eff}}}, \qquad (9)$$

where $V_{\rm eff}$ is in units of $h^{-3}\,\mathrm{Gpc}^3$ and the constant of proportionality
(in this case, 1.5) depends on which cosmological parameters
are held fixed; in the present case models are considered in which
the distance to the epoch of last scattering is fixed as the dark
energy equation of state parameter varies. We obtain an estimate of
the accuracy with which $w$ can be measured by inserting $V_{\rm eff}$ into
Eq. 9, which is shown in Table 3 for the Bau05 and Bow06 models.

Table 3. The effective volume of Hα- and H-band selected surveys for different selection criteria. We evaluate a given survey configuration in terms of its
effective volume in the redshift range $0 < z < 2$ (top) and $0.5 < z < 2$ (bottom), which is expressed as a fraction of the geometrical volume over the same
redshift interval. The first column shows the galaxy selection method used, Hα for an Hα selected survey with a minimum flux limit and $EW_{\rm obs}$ cut or $H_{AB}$
for an H-band magnitude limited survey. The second column shows the H-band magnitude limit chosen in a given configuration, where applicable. The third
column shows the minimum Hα flux chosen, again where applicable, and the fourth column the minimum $EW_{\rm obs}$ cut applied. The fifth column shows the
redshift success rate assumed. Columns 6, 7 and 8 show the fractional effective volume obtained for a given configuration in the Bau05, Bow06 and the
diluted version of the Bow06 model respectively. Finally, columns 9, 10 and 11 show our estimate of the corresponding percentage error on the determination
of $w$, the dark energy equation of state parameter, for the Bau05, Bow06 and diluted Bow06 models, respectively.

                                                        V_eff/V                     Δw (%)
Selection  H_AB   log(F_Hα)   EW_obs [Å]  Success   Bau05  Bow06  Bow06(d)    Bau05  Bow06  Bow06(d)
...        ...    ...         ...         ...       ...    ...    ...         0.8    0.9    0.0
Hα         -      -15.40      -           1.00      0.21   0.15   0.00        0.8    0.9    0.0
Hα         -      -15.70      100         0.33      0.18   0.07   0.00        0.9    1.2    0.0
Hα         -      -15.70      100         1.00      0.43   0.17   0.00        0.5    0.8    0.0
Hα         -      -15.70      -           1.00      0.44   0.17   0.00        0.5    0.8    0.0
Hα         -      -16.00      100         0.33      0.33   0.21   0.00        0.6    0.7    0.0
Hα         -      -16.00      100         1.00      0.62   0.41   0.00        0.4    0.5    0.0
Hα         -      -16.00      -           1.00      0.63   0.41   0.00        0.4    0.5    0.0
H(AB)      21     -           -           0.33      0.09   0.10   0.19        1.2    1.1    0.8
H(AB)      21     -           -           1.00      0.14   0.18   0.35        1.0    0.8    0.6
H(AB)      22     -           -           0.33      0.19   0.27   0.43        0.8    0.6    0.5
H(AB)      22     -           -           1.00      0.30   0.44   0.67        0.6    0.5    0.4
H(AB)      23     -           -           0.33      0.38   0.55   0.67        0.6    0.4    0.4
H(AB)      23     -           -           1.00      0.57   0.77   0.86        0.5    0.4    0.3
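
The scaling attributed above to Angulo et al. (2008a), in which the percentage error on $w$ is 1.5 divided by the square root of the effective volume in $h^{-3}\,\mathrm{Gpc}^3$, is simple to evaluate. The sketch below is illustrative only: the function name is hypothetical and the effective volumes used in the example are round numbers, not values from Table 3.

```python
import math


def delta_w_percent(v_eff, constant=1.5):
    """Percentage error on w for an effective volume in h^-3 Gpc^3.

    The constant 1.5 is the value quoted in the text; it depends on which
    cosmological parameters are held fixed.
    """
    if v_eff <= 0.0:
        raise ValueError("the effective volume must be positive")
    return constant / math.sqrt(v_eff)


# Round-number effective volumes, for illustration.
for v_eff in (1.0, 4.0, 9.0):
    print(f"V_eff = {v_eff:.0f} h^-3 Gpc^3: Δw = {delta_w_percent(v_eff):.2f}%")
```

Because the error falls only as the inverse square root of $V_{\rm eff}$, halving the error requires four times the effective volume.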



5 DISCUSSION AND CONCLUSIONS 

In this paper we have presented the first predictions for clustering
measurements expected from future space-based surveys to be
conducted with instrumentation sensitive in the near-infrared. We have
used published galaxy formation models to predict the abundance
and clustering of galaxies selected by either their Hα line emission
or H-band continuum magnitude. The motivation for this exercise
is to assess the relative performance of the spectroscopic solutions
proposed for galaxy surveys in forthcoming space missions which
have the primary aim of constraining the nature of dark energy.

The physical processes behind Hα and H-band emission are
quite different. Hα emission is sensitive to the instantaneous star
formation rate in a galaxy, as the line emission is driven by the
number of Lyman continuum photons produced by massive young
stars. Emission in the observer frame H-band typically probes the
rest frame R-band for the proposed magnitude limits and is more
sensitive to the stellar mass of the galaxy than to the instantaneous
star formation rate.

The GALFORM code predicts the star formation histories of a
wide population of galaxies, and so naturally predicts their star
formation rates and stellar masses at the time of observation. Variation
in galaxy properties is driven by the mass and formation history of
the host dark matter halo. This is because the strength of a range
of physical effects depends on halo properties such as the depth of
the gravitational potential well or the gas cooling time. This point
is most striking in our plot of the spatial distribution of Hα and
H-band selected galaxies, Fig. 8. This figure shows remarkable
differences in the way that these galaxies trace the underlying dark
matter distribution. Hα emitters avoid the most massive dark matter
haloes and trace out the filamentary structures surrounding them.
The H-band selected galaxies, on the other hand, are preferentially found in
the most massive haloes. This difference in the spatial distribution
of these tracers has important consequences for the redshift space
distortion of clustering.

In this paper we have studied two published galaxy formation
models, those of Baugh et al. (2005) and Bower et al. (2006). The
models were originally tuned to reproduce a subset of observations
of the local galaxy population and also enjoy notable successes at
high redshift. We presented the first comparison of the model
predictions for the properties of Hα emitters, extending the work of
Le Delliou et al. (2005, 2006) and Orsi et al. (2008), who looked at
the nature of Lyman-alpha emitters in the models. Observations of
Hα emitters are still in their infancy and the datasets are small. The
model predictions bracket the current observational estimates of
the luminosity function of emitters and are in reasonable agreement
with the distribution of equivalent widths.

The next step towards making predictions of the effectiveness
of future redshift surveys is to construct mock catalogues from the
galaxy formation models (see Baugh 2008). Using the currently
available data, we used various approaches to fine tune the models
to reproduce the observations as closely as possible. The main
technique was to rescale the line and continuum luminosities of model
galaxies; another approach was to randomly dilute or sample galaxies
from the catalogue. This allowed us to better match the number
of observed galaxies. The resulting mocks gave reasonable matches
to the available clustering data around $z \sim 2$. Our goal in this paper
was to make faithful mock catalogues. The nature of Hα emitters
in hierarchical models will be pursued in a future paper.

The ability of future surveys to measure the large scale structure
of the Universe can be quantified in terms of their effective
volumes. The effective volume takes into account the effect of the
discreteness of sources on the measurement of galaxy clustering. If
the discreteness noise is comparable to the clustering signal, it
becomes hard to extract any useful clustering information. Once this
point is reached, although the available geometrical volume is
increased by going deeper in redshift, in practice there is little point
as no further statistical power is being added to the clustering
measurements. The error on a power spectrum or correlation function
measurement scales as the inverse square root of the effective
volume. In the case of flux-limited samples, the number density of
sources falls rapidly with increasing redshift beyond the median
redshift. Even though the effective bias of these galaxies tends to
increase with redshift, it does not do so at a rate sufficient to offset
the decline in the number density. The GALFORM model naturally
predicts the abundance and clustering strength of galaxies needed
to compute the effective volume of a galaxy survey.

The differences in the expected performance of Hα and H-band
selected galaxies when measuring the power spectrum are
related to the different nature of the galaxies selected by these
two methods. Hα emitters are active star forming galaxies, which
gives them a smaller bias than H-band selected galaxies.
Their redshift distribution is also very sensitive to the details
of the physics of star formation: the top-heavy IMF in
bursts in the Bau05 model boosts the number density of bright
emitters, making the redshift distribution of Hα emitters very flat
and slowly decreasing towards high redshifts, in contrast to the
predictions of the Bow06 model, where a sharp peak at $z \sim 0.5$ and
a rapid decrease at higher redshifts are found. H-band galaxies are
less sensitive to this effect, and the redshift distributions are similar
in both models. This is why the balance between the power spectrum
amplitude (given by the effective bias) and the number density
translates into two different effective volumes for Hα and H-band
selected galaxies.

Although there are differences in detail between the model
predictions, they give similar bottom lines for the effective volumes
of the survey configurations of each galaxy selection. Comparing
the spectroscopic solutions in Table 3, a slit based survey
down to $H_{AB} = 22$ would sample 4-10 times the effective
volume which could be reached by a slitless survey to
$\log(F_{\rm H\alpha}\,[\mathrm{erg\,s^{-1}\,cm^{-2}}]) = -15.4$, taking into account the likely
redshift success rate. To match the performance of the H-band survey,
an Hα survey would need to go much deeper in flux, down to
$\log(F_{\rm H\alpha}\,[\mathrm{erg\,s^{-1}\,cm^{-2}}]) = -16$.

We have also looked at the accuracy with which Hα emitters
and H-band selected galaxies will be able to measure the bulk
motions of galaxies and hence the rate at which fluctuations are
growing, another key test of gravity and the nature of dark energy. All
of the samples we considered showed a small systematic difference
between the measured growth rate and the theoretical expectation,
at about the $1\sigma$ level. The error on the growth rate from an Hα
survey with $\log(F_{\rm H\alpha}\,[\mathrm{erg\,s^{-1}\,cm^{-2}}]) > -15.4$ was found to be
about three times larger than that for a sample with $H_{AB} < 22$.